Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


collections - Regarding HashMap implementation in Java

I was researching HashMap and came across the following analysis:

https://stackoverflow.com/questions/11596549/how-does-javas-hashmap-work-internally/18492835#18492835

Q1 Can you show me a simple map and walk through the process in detail: how the hash code for a given key is calculated, and how the position (bucket number) where the element should be placed is derived with the formula hash % (arrayLength-1)? Say I have this HashMap:

import java.util.HashMap;

HashMap<String, String> map = new HashMap<>(); // HashMap keeps keys in no particular order
map.put("Amit", "Java");
map.put("Saral", "J2EE");

Q2 Sometimes the hash codes of 2 different objects are the same. In this case both objects are saved in one bucket and stored as a linked list. The entry point is the more recently added object. This object refers to the other object through its next field, and so on. The last entry refers to null. Can you show me this with a real example?


"Amit" will be distributed to the 10th bucket because of the bit twiddling. Without the bit twiddling it would go to the 7th bucket, because 2044535 & 15 = 7. How is this possible? Please explain the whole calculation in detail.


1 Reply


Regarding "how the hash code for the given key is calculated in detail by using this formula":

For a String, this is calculated by String#hashCode(), which (in the JDK version quoted here) is implemented as follows:

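 public int hashCode() {
     int h = hash;                  // cached value, 0 until it has been computed
     int len = count;
     if (h == 0 && len > 0) {
         int off = offset;
         char val[] = value;

         for (int i = 0; i < len; i++) {
             h = 31*h + val[off++]; // multiply the running hash by 31, add the next char
         }
         hash = h;                  // cache the result for later calls
     }
     return h;
 }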

This follows the equation given in the Javadoc (where ^ denotes exponentiation and s[i] is the i-th character):

 hashcode = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

One interesting thing to note about this implementation is that String actually caches its hash code. It can do this because String is immutable.

Calculating the hash code of the String "Amit" yields this integer:

System.out.println("Amit".hashCode());
>     2044535
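
To see where 2044535 comes from, the polynomial can be evaluated by hand. The following is a minimal sketch; the class name and the polynomialHash helper are made up for illustration and are not part of the JDK.

```java
// Sketch: recompute the String hash by hand to check it against the formula.
class StringHashDemo {

    // Hypothetical helper: evaluates s[0]*31^(n-1) + ... + s[n-1] using
    // int arithmetic (overflow wraps around), like the loop in String#hashCode().
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // 'A' = 65, 'm' = 109, 'i' = 105, 't' = 116
        // 65*31^3 + 109*31^2 + 105*31 + 116
        //   = 1936415 + 104749 + 3255 + 116 = 2044535
        System.out.println(polynomialHash("Amit")); // 2044535
        System.out.println("Amit".hashCode());      // 2044535
    }
}
```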

Let's walk through a simple put into a map, but first we have to look at how the map is built. The most interesting fact about a Java HashMap is that it always has 2^n buckets. If you create one with the default constructor, the number of buckets is 16, which is 2^4.

A put operation on this map first gets the hash code of the key. Some bit twiddling is then applied to this hash code to ensure that poor hash functions (especially those whose results do not differ in the lower bits) don't "overload" a single bucket.

The function that actually distributes your key to a bucket is the following:

 h & (length-1); // length is the current number of buckets, h the hashcode of the key

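This only works when the number of buckets is a power of two, because it uses a bitwise & with length-1 instead of a modulo to map the key to a bucket. For a power of two, length-1 has all of its low bits set, so the & keeps exactly the low bits of h.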
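
The power-of-two requirement can be checked with a small experiment. This is only a sketch: indexByMask and reachableBuckets are illustrative helper names, not HashMap methods. It compares the mask with Math.floorMod and counts how many buckets the mask can reach for a given table length.

```java
// Sketch: why h & (length-1) needs a power-of-two length.
class MaskVsModuloDemo {

    // The mapping discussed above.
    static int indexByMask(int h, int length) {
        return h & (length - 1);
    }

    // Hypothetical helper: counts the distinct indices the mask produces
    // for the hash values 0..1023.
    static int reachableBuckets(int length) {
        boolean[] seen = new boolean[length];
        int count = 0;
        for (int h = 0; h < 1024; h++) {
            int i = indexByMask(h, length);
            if (!seen[i]) {
                seen[i] = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int h = "Amit".hashCode();                 // 2044535
        System.out.println(indexByMask(h, 16));    // 7
        System.out.println(Math.floorMod(h, 16));  // 7, same as the mask
        System.out.println(reachableBuckets(16));  // 16, every bucket is usable
        System.out.println(reachableBuckets(10));  // 4, mask 9 = 1001b only reaches 0, 1, 8, 9
    }
}
```

With a length of 10 the mask would leave six buckets permanently empty, which is why the table size is kept at a power of two.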

"Amit" is placed in bucket 10 because of the bit twiddling. Without the bit twiddling it would go to bucket 7, because 2044535 & 15 = 7.
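
The two bucket numbers can be reproduced step by step. The sketch below assumes the supplemental hash function found in older (JDK 6/7 era) HashMap sources. Newer JDK versions spread the bits differently, so the index they compute for "Amit" may not be 10. The class and method names are chosen for illustration.

```java
// Sketch: bucket index for "Amit" with and without the bit twiddling.
class BucketIndexDemo {

    // Assumption: the supplemental hash as it appeared in older HashMap sources.
    // It mixes the higher bits of h into the lower ones.
    static int supplementalHash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int raw = "Amit".hashCode();   // 2044535 = 0x1F3277
        // Without twiddling: only the low four bits (0x7) are used.
        System.out.println(indexFor(raw, 16));                    // 7
        // With twiddling:
        //   (raw >>> 20) ^ (raw >>> 12) = 0x1 ^ 0x1F3 = 0x1F2
        //   h = 0x1F3277 ^ 0x1F2 = 0x1F3385
        //   low four bits of h ^ (h >>> 7) ^ (h >>> 4) = 5 ^ 7 ^ 8 = 10
        System.out.println(indexFor(supplementalHash(raw), 16));  // 10
    }
}
```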

Now that we have an index, we can find the bucket. If the bucket already contains elements, we iterate over them and replace the entry whose key is equal, if there is one. If no matching entry is found in the linked list, the new entry is simply added at the beginning of the list.
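
A collision is easy to provoke with the strings "Aa" and "BB", which both have the hash code 2112. The following toy map is a simplified sketch of the put logic described above, not the real HashMap: it skips the bit twiddling and resizing, and all names are illustrative.

```java
// Sketch: one bucket holding two colliding keys as a linked list.
class ChainedBucketDemo {

    // One node of a bucket's linked list.
    static final class Entry {
        final String key;
        String value;
        Entry next;

        Entry(String key, String value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Entry[] table = new Entry[16];

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Replaces the value of an equal key, otherwise prepends a new entry.
    // Returns the previous value or null.
    String put(String key, String value) {
        int i = indexFor(key.hashCode(), table.length);
        for (Entry e = table[i]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                String old = e.value;
                e.value = value;
                return old;
            }
        }
        table[i] = new Entry(key, value, table[i]); // new head, old head becomes next
        return null;
    }

    // Renders the chain of bucket i, ending in "null".
    String chain(int i) {
        StringBuilder sb = new StringBuilder();
        for (Entry e = table[i]; e != null; e = e.next) {
            sb.append(e.key).append('=').append(e.value).append(" -> ");
        }
        return sb.append("null").toString();
    }

    public static void main(String[] args) {
        ChainedBucketDemo map = new ChainedBucketDemo();
        map.put("Aa", "first");   // "Aa".hashCode() = 65*31 + 97 = 2112
        map.put("BB", "second");  // "BB".hashCode() = 66*31 + 66 = 2112
        int i = indexFor("Aa".hashCode(), 16);  // 2112 & 15 = 0
        System.out.println(map.chain(i));       // BB=second -> Aa=first -> null
    }
}
```

The more recently added key "BB" is the entry point of the bucket, and the last entry refers to null.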

The next important thing in HashMap is resizing. If the number of entries in the map exceeds a threshold (determined by the current number of buckets and the load factor, in our case 16*0.75=12), the backing array is resized. A resize always doubles the current number of buckets, so the count is guaranteed to remain a power of two and the bucket-finding function keeps working.
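
The threshold arithmetic and the effect of doubling on a bucket index can be sketched as follows. The helper names are illustrative, and the index calculation uses the raw hash code without the bit twiddling to keep the numbers easy to follow.

```java
// Sketch: thresholds for successive table sizes and the index shift after doubling.
class ResizeThresholdDemo {

    // Number of entries above which the table is resized.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        for (int capacity = 16; capacity <= 128; capacity *= 2) {
            System.out.println("capacity=" + capacity
                    + " threshold=" + threshold(capacity, 0.75f));
        }
        // capacity=16 threshold=12
        // capacity=32 threshold=24
        // capacity=64 threshold=48
        // capacity=128 threshold=96

        int h = "Amit".hashCode();             // 2044535
        System.out.println(indexFor(h, 16));   // 7
        System.out.println(indexFor(h, 32));   // 23, one more bit of h is used after doubling
    }
}
```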

Since the number of buckets changes, all current entries in the table have to be rehashed. This is quite costly, so if you know how many items there will be, initialize the HashMap with a capacity large enough for that count (at least the count divided by the load factor) so it does not have to resize repeatedly.
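
As a rough sketch of that advice, the initial capacity can be derived from the expected number of entries. The capacityFor helper is hypothetical; HashMap itself only offers the HashMap(int initialCapacity) constructor used here and rounds the requested capacity up to a power of two internally.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: pre-sizing a HashMap so that filling it does not trigger a resize.
class PresizedMapDemo {

    // Hypothetical helper: smallest capacity whose threshold
    // (capacity * loadFactor) is not below the expected number of entries.
    static int capacityFor(int expectedEntries, float loadFactor) {
        return (int) Math.ceil(expectedEntries / loadFactor);
    }

    public static void main(String[] args) {
        int expected = 1000;
        // 1000 / 0.75 = 1333.33..., rounded up to 1334
        Map<String, String> map = new HashMap<>(capacityFor(expected, 0.75f));
        for (int i = 0; i < expected; i++) {
            map.put("key" + i, "value" + i);
        }
        System.out.println(map.size()); // 1000
    }
}
```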

