Fork me on GitHub

数据结构-集合和映射

Set

不能存放重复元素
接口方法

1
2
3
4
5
6
7
8
public interface Set<E> {

void add(E e);
boolean contains(E e);
void remove(E e);
int getSize();
boolean isEmpty();
}

二分搜索树实现

借助前面的二分搜索树,可以很轻松的实现Set

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
public class BSTSet<E extends Comparable<E>> implements Set<E> {

private BST<E> bst;

public BSTSet() {
bst = new BST<>();
}

@Override
public int getSize() {
return bst.size();
}

@Override
public boolean isEmpty() {
return bst.isEmpty();
}

@Override
public void add(E e) {
bst.add(e);
}

@Override
public boolean contains(E e) {
return bst.contains(e);
}

@Override
public void remove(E e) {
bst.remove(e);
}
}

链表实现

使用前面创建的LinkedListSet来实现

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
public class LinkedListSet<E> implements Set<E> {

private LinkedList<E> list;

public LinkedListSet() {
list = new LinkedList<>();
}

@Override
public int getSize() {
return list.getSize();
}

@Override
public boolean isEmpty() {
return list.isEmpty();
}

@Override
public void add(E e) {
if (!list.contains(e))
list.addFirst(e);
}

@Override
public boolean contains(E e) {
return list.contains(e);
}

@Override
public void remove(E e) {
list.removeElement(e);
}

}

时间复杂度分析

编写一个main函数进行测试

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
import java.util.ArrayList;

public class Main {

private static double testSet(Set<String> set, String filename){

long startTime = System.nanoTime();

System.out.println(filename);
ArrayList<String> words = new ArrayList<>();
if(FileOperation.readFile(filename, words)) {
System.out.println("Total words: " + words.size());

for (String word : words)
set.add(word);
System.out.println("Total different words: " + set.getSize());
}
long endTime = System.nanoTime();

return (endTime - startTime) / 1000000000.0;
}

public static void main(String[] args) {

String filename = "pride-and-prejudice.txt";

BSTSet<String> bstSet = new BSTSet<>();
double time1 = testSet(bstSet, filename);
System.out.println("BST Set: " + time1 + " s");

System.out.println();

LinkedListSet<String> linkedListSet = new LinkedListSet<>();
double time2 = testSet(linkedListSet, filename);
System.out.println("Linked List Set: " + time2 + " s");

}
}

结果

1
2
3
4
5
6
7
8
9
pride-and-prejudice.txt
Total words: 125901
Total different words: 6530
BST Set: 0.1498369 s

pride-and-prejudice.txt
Total words: 125901
Total different words: 6530
Linked List Set: 3.2225657 s
LinkedListSetBSTSet
增 addO(n)O(h)
查 containsO(n)O(h)
删 removeO(n)O(h)

h表示树的深度, h和n的关系如下:
假设树为满二叉树, 根节点所在的那一层算是第0层

层数节点数
0层1
1层2
2层4
3层8
4层16
h-1层2^(h-1)

这样算下来的话, h层, 共有 2^h-1个结点,
2^h -1 = n ==> h=log₂(n+1)=O(log₂n)=O(logn)

这样算下来,时间复杂度应该是:

LinkedListSetBSTSet 最优
增 addO(n)O(h) O(logn)
查 containsO(n)O(h) O(logn)
删 removeO(n)O(h) O(logn)

如果二叉树退化成链表,那么时间复杂度就和链表集合一样了

LinkedListSetBSTSet - 最优 - 最坏
增 addO(n)O(h) - O(logn) - O(n)
查 containsO(n)O(h) - O(logn) - O(n)
删 removeO(n)O(h) - O(logn) - O(n)

Map

存储(键, 值)数据对的数据结构, 根据键去寻找值, 可以很轻松的使用链表或者二分搜索树实现

1
2
3
4
5
6
7
8
9
10
11
12
class Node{
K key;
V value;
Node next;
}

class Node{
K key;
V value;
Node left;
Node right;
}

映射的接口类

1
2
3
4
5
6
7
8
9
public interface Map<K, V> {
void add(K key, V value);
V remove(K key);
boolean contains(K key);
V get(K key);
void set(K key, V newValue);
int getSize();
boolean isEmpty();
}

链表实现

需要对节点内容进行修改

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
public class LinkedListMap<K, V> implements Map<K, V> {

private class Node{
public K key;
public V value;
public Node next;

public Node(K key, V value, Node next){
this.key = key;
this.value = value;
this.next = next;
}

public Node(K key, V value){
this(key, value, null);
}

public Node(){
this(null, null, null);
}

@Override
public String toString(){
return key.toString() + " : " + value.toString();
}
}

private Node dummyHead;
private int size;

public LinkedListMap(){
dummyHead = new Node();
size = 0;
}

@Override
public int getSize(){
return size;
}

@Override
public boolean isEmpty(){
return size == 0;
}

private Node getNode(K key){
Node cur = dummyHead.next;
while(cur != null){
if(cur.key.equals(key))
return cur;
cur = cur.next;
}
return null;
}

@Override
public boolean contains(K key){
return getNode(key) != null;
}

@Override
public V get(K key){
Node node = getNode(key);
return node == null ? null : node.value;
}

@Override
public void add(K key, V value){
Node node = getNode(key);
if(node == null){
dummyHead.next = new Node(key, value, dummyHead.next);
size ++;
}
else
node.value = value;
}

@Override
public void set(K key, V newValue){
Node node = getNode(key);
if(node == null)
throw new IllegalArgumentException(key + " doesn't exist!");

node.value = newValue;
}

@Override
public V remove(K key){

Node prev = dummyHead;
while(prev.next != null){
if(prev.next.key.equals(key))
break;
prev = prev.next;
}

if(prev.next != null){
Node delNode = prev.next;
prev.next = delNode.next;
delNode.next = null;
size --;
return delNode.value;
}

return null;
}
}

二分搜索树实现

修改一下节点内容, 其他的都可以复用二分搜索树中的方法

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170

public class BSTMap<K extends Comparable<K>, V> implements Map<K, V> {

private class Node {
public K key;
public V value;
public Node left, right;

public Node(K key, V value) {
this.key = key;
this.value = value;
left = null;
right = null;
}
}

private Node root;
private int size;

public BSTMap() {
root = null;
size = 0;
}

@Override
public int getSize() {
return size;
}

@Override
public boolean isEmpty() {
return size == 0;
}

// 向二分搜索树中添加新的元素(key, value)
@Override
public void add(K key, V value) {
root = add(root, key, value);
}

// 向以node为根的二分搜索树中插入元素(key, value),递归算法
// 返回插入新节点后二分搜索树的根
private Node add(Node node, K key, V value) {

if (node == null) {
size++;
return new Node(key, value);
}

if (key.compareTo(node.key) < 0)
node.left = add(node.left, key, value);
else if (key.compareTo(node.key) > 0)
node.right = add(node.right, key, value);
else // key.compareTo(node.key) == 0
node.value = value;

return node;
}

// 返回以node为根节点的二分搜索树中,key所在的节点
private Node getNode(Node node, K key) {

if (node == null)
return null;

if (key.equals(node.key))
return node;
else if (key.compareTo(node.key) < 0)
return getNode(node.left, key);
else // if(key.compareTo(node.key) > 0)
return getNode(node.right, key);
}

@Override
public boolean contains(K key) {
return getNode(root, key) != null;
}

@Override
public V get(K key) {

Node node = getNode(root, key);
return node == null ? null : node.value;
}

@Override
public void set(K key, V newValue) {
Node node = getNode(root, key);
if (node == null)
throw new IllegalArgumentException(key + " doesn't exist!");

node.value = newValue;
}

// 返回以node为根的二分搜索树的最小值所在的节点
private Node minimum(Node node) {
if (node.left == null)
return node;
return minimum(node.left);
}

// 删除掉以node为根的二分搜索树中的最小节点
// 返回删除节点后新的二分搜索树的根
private Node removeMin(Node node) {

if (node.left == null) {
Node rightNode = node.right;
node.right = null;
size--;
return rightNode;
}

node.left = removeMin(node.left);
return node;
}

// 从二分搜索树中删除键为key的节点
@Override
public V remove(K key) {

Node node = getNode(root, key);
if (node != null) {
root = remove(root, key);
return node.value;
}
return null;
}

private Node remove(Node node, K key) {

if (node == null)
return null;

if (key.compareTo(node.key) < 0) {
node.left = remove(node.left, key);
return node;
} else if (key.compareTo(node.key) > 0) {
node.right = remove(node.right, key);
return node;
} else { // key.compareTo(node.key) == 0

// 待删除节点左子树为空的情况
if (node.left == null) {
Node rightNode = node.right;
node.right = null;
size--;
return rightNode;
}

// 待删除节点右子树为空的情况
if (node.right == null) {
Node leftNode = node.left;
node.left = null;
size--;
return leftNode;
}

// 待删除节点左右子树均不为空的情况
// 找到比待删除节点大的最小节点, 即待删除节点右子树的最小节点
// 用这个节点顶替待删除节点的位置
Node successor = minimum(node.right);
successor.right = removeMin(node.right);
successor.left = node.left;

node.left = node.right = null;

return successor;
}
}
}

时间复杂度分析

编写一个main函数进行测试

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
import java.util.ArrayList;

public class Main {

private static double testMap(Map<String, Integer> map, String filename){

long startTime = System.nanoTime();

System.out.println(filename);
ArrayList<String> words = new ArrayList<>();
if(FileOperation.readFile(filename, words)) {
System.out.println("Total words: " + words.size());

for (String word : words){
if(map.contains(word))
map.set(word, map.get(word) + 1);
else
map.add(word, 1);
}

System.out.println("Total different words: " + map.getSize());
System.out.println("Frequency of PRIDE: " + map.get("pride"));
System.out.println("Frequency of PREJUDICE: " + map.get("prejudice"));
}

long endTime = System.nanoTime();

return (endTime - startTime) / 1000000000.0;
}

public static void main(String[] args) {

String filename = "pride-and-prejudice.txt";

BSTMap<String, Integer> bstMap = new BSTMap<>();
double time1 = testMap(bstMap, filename);
System.out.println("BST Map: " + time1 + " s");

System.out.println();

LinkedListMap<String, Integer> linkedListMap = new LinkedListMap<>();
double time2 = testMap(linkedListMap, filename);
System.out.println("Linked List Map: " + time2 + " s");

}
}

运行结果

1
2
3
4
5
6
7
8
9
10
11
12
13
pride-and-prejudice.txt
Total words: 125901
Total different words: 6530
Frequency of PRIDE: 53
Frequency of PREJUDICE: 11
BST Map: 0.1996936 s

pride-and-prejudice.txt
Total words: 125901
Total different words: 6530
Frequency of PRIDE: 53
Frequency of PREJUDICE: 11
Linked List Map: 14.6253311 s

时间复杂度和集合类似

LinkedListSetBSTSet - 最优 - 最坏
增 addO(n)O(h) - O(logn) - O(n)
删 removeO(n)O(h) - O(logn) - O(n)
改 setO(n)O(h) - O(logn) - O(n)
查 getO(n)O(h) - O(logn) - O(n)
查 containsO(n)O(h) - O(logn) - O(n)

对比

完整代码

-------------本文结束感谢您的阅读-------------
坚持原创技术分享,您的支持将鼓励我继续创作!
0%