Discussion on expansion opportunity and threshold of HashMap (JDK1.8)

King Wang

January 3, 2022

Recently, while reviewing some fundamentals, I found that many summaries of HashMap contain problems, especially around when capacity expansion happens and how the threshold is determined, so I decided to go back to the source code.

This article only addresses questions about capacity expansion; for other detailed analysis, please refer to other articles.

When does capacity expansion happen

  • When put finds that table has not been initialized, an initial resize is performed
  • When, after put has inserted a node, size (the number of key-value pairs) exceeds threshold, the table is expanded

Both cases are marked with comments in the source code below.

 final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                boolean evict) {
     Node<K,V>[] tab; Node<K,V> p; int n, i;
     if ((tab = table) == null || (n = tab.length) == 0)
         n = (tab = resize()).length; // case 1: table not initialized, initial resize
     if ((p = tab[i = (n - 1) & hash]) == null)
         tab[i] = newNode(hash, key, value, null);
     else {
         Node<K,V> e; K k;
         if (p.hash == hash &&
             ((k = p.key) == key || (key != null && key.equals(k))))
             e = p;
         else if (p instanceof TreeNode)
             e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
         else {
             for (int binCount = 0; ; ++binCount) {
                 if ((e = p.next) == null) {
                     p.next = newNode(hash, key, value, null);
                     if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                         treeifyBin(tab, hash);
                     break;
                 }
                 if (e.hash == hash &&
                     ((k = e.key) == key || (key != null && key.equals(k))))
                     break;
                 p = e;
             }
         }
         if (e != null) { // existing mapping for key
             V oldValue = e.value;
             if (!onlyIfAbsent || oldValue == null)
                 e.value = value;
             afterNodeAccess(e);
             return oldValue;
         }
     }
     ++modCount;
     if (++size > threshold)
         resize(); // case 2: size exceeds threshold, resize
     afterNodeInsertion(evict);
     return null;
 }

How is the threshold determined

In fact, the key question is how the threshold is determined. Most articles online only say that the threshold equals load factor * current capacity. That conclusion is correct, but the exact process differs depending on which constructor is used.
Below we analyze only the relationship between the threshold and the capacity.
Here is part of the resize() code:

 final Node<K,V>[] resize() {
     Node<K,V>[] oldTab = table;
     int oldCap = (oldTab == null) ? 0 : oldTab.length;
     int oldThr = threshold;
     int newCap, newThr = 0;
     if (oldCap > 0) {
         if (oldCap >= MAXIMUM_CAPACITY) {
             threshold = Integer.MAX_VALUE;
             return oldTab;
         }
         else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                  oldCap >= DEFAULT_INITIAL_CAPACITY)
             newThr = oldThr << 1; // double threshold
     }
     else if (oldThr > 0) // initial capacity was placed in threshold
         newCap = oldThr;
     else {               // zero initial threshold signifies using defaults
         newCap = DEFAULT_INITIAL_CAPACITY;
         newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
     }
     if (newThr == 0) {
         float ft = (float)newCap * loadFactor;
         newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                   (int)ft : Integer.MAX_VALUE);
     }
     threshold = newThr;
     /*** the code that moves the entries into the new table is omitted here ***/
 }

No-argument constructor

 public HashMap() {
     this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
 }

You can see that the no-argument constructor only initializes the load factor to its default value of 0.75.
==Note: at this point the HashMap's table and threshold have not been initialized yet, i.e. both the capacity and the threshold are 0.==
On the first put, the code finds table == null and calls resize() for the first time.
Inside resize(), execution takes the final else branch shown above ("zero initial threshold signifies using defaults").
So the capacity cap becomes the default value of 16, and the threshold becomes 16 * 0.75 = 12.
In every subsequent expansion, both the capacity and the threshold are doubled,
so the threshold always stays at 0.75 of the capacity.
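As a quick sanity check, here is a small sketch of my own showing the expected progression for a map created with the no-argument constructor, assuming the default 0.75 load factor:

 // Sketch: capacity/threshold progression for new HashMap<>() after the first put.
 public class DefaultProgression {
     public static void main(String[] args) {
         int cap = 16, thr = 12;            // values set by the initial resize
         for (int i = 0; i < 3; i++) {
             cap <<= 1;                     // each resize doubles the capacity: 32, 64, 128
             thr <<= 1;                     // and doubles the threshold:        24, 48, 96
             System.out.println(cap + " / " + thr + " -> ratio " + ((double) thr / cap)); // always 0.75
         }
     }
 }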

Constructor with parameters

 public HashMap(int initialCapacity) {
     this(initialCapacity, DEFAULT_LOAD_FACTOR);
 }

 public HashMap(int initialCapacity, float loadFactor) {
     if (initialCapacity < 0)
         throw new IllegalArgumentException("Illegal initial capacity: " +
                                            initialCapacity);
     if (initialCapacity > MAXIMUM_CAPACITY)
         initialCapacity = MAXIMUM_CAPACITY;
     if (loadFactor <= 0 || Float.isNaN(loadFactor))
         throw new IllegalArgumentException("Illegal load factor: " +
                                            loadFactor);
     this.loadFactor = loadFactor;
     this.threshold = tableSizeFor(initialCapacity);
 }

Apart from the validity checks on the parameters, this code mainly sets the load factor and the threshold, and the threshold is determined by the tableSizeFor() function.

 /**
  * Returns a power of two size for the given target capacity.
  */
 static final int tableSizeFor(int cap) {
     int n = cap - 1;
     n |= n >>> 1;
     n |= n >>> 2;
     n |= n >>> 4;
     n |= n >>> 8;
     n |= n >>> 16;
     return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
 }

As the comment says, the threshold will be set to the smallest power of two (2^n) that is greater than or equal to the given initial capacity.
Note that at this point only the threshold has been set; table is still not initialized, and the capacity cap is still 0.
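For example, copying the tableSizeFor() logic above into a small standalone test of my own (MAXIMUM_CAPACITY is 1 << 30 in HashMap) gives the expected powers of two:

 // Sketch: exercising the tableSizeFor() bit trick shown above.
 public class TableSizeForDemo {
     static final int MAXIMUM_CAPACITY = 1 << 30; // same constant as in HashMap

     static int tableSizeFor(int cap) {
         int n = cap - 1;
         n |= n >>> 1;
         n |= n >>> 2;
         n |= n >>> 4;
         n |= n >>> 8;
         n |= n >>> 16;
         return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
     }

     public static void main(String[] args) {
         System.out.println(tableSizeFor(13)); // 16 -> smallest power of two >= 13
         System.out.println(tableSizeFor(16)); // 16 -> already a power of two
         System.out.println(tableSizeFor(17)); // 32
     }
 }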

Again, the first resize happens on the first put; this time execution takes the else if (oldThr > 0) branch shown above ("initial capacity was placed in threshold").
You can see that the capacity cap is set to the power-of-two value just computed and stored in the threshold, and the threshold is then reset to cap*loadFactor. The two still end up in the loadFactor ratio, but the value is first stored in the threshold, then assigned to the capacity, and only then is the threshold recomputed from the capacity and the load factor. In this process the threshold acts as an intermediate variable.
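This hand-off can also be observed with another small reflection sketch of my own (same --add-opens caveat on newer JDKs), using new HashMap<>(13) as an example:

 import java.lang.reflect.Field;
 import java.util.HashMap;

 // Sketch: watch the threshold act as an intermediate variable for new HashMap<>(13).
 public class InitialCapacityDemo {
     static Object field(HashMap<?, ?> map, String name) throws Exception {
         Field f = HashMap.class.getDeclaredField(name);
         f.setAccessible(true); // may require --add-opens java.base/java.util=ALL-UNNAMED on JDK 9+
         return f.get(map);
     }

     public static void main(String[] args) throws Exception {
         HashMap<String, String> map = new HashMap<>(13);
         System.out.println(field(map, "threshold"));                  // 16   -> tableSizeFor(13), stored here for now
         System.out.println(field(map, "table"));                      // null -> table not allocated yet
         map.put("a", "b");                                            // first put triggers resize()
         System.out.println(((Object[]) field(map, "table")).length);  // 16   -> capacity taken from the stored threshold
         System.out.println(field(map, "threshold"));                  // 12   -> recomputed as 16 * 0.75
     }
 }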

In all subsequent expansions, the capacity and the threshold still keep the load-factor ratio, and both are doubled at the same time.
