C語言的HashTable簡(jiǎn)單實(shí)現(xiàn)

心不留意外塵 2016-05-18

展開全文

http://blog.csdn.net/zmxiangde_88/article/details/8025541

2012

HashTable是在實(shí)際應(yīng)用中很重要的一個(gè)結(jié)構(gòu)，下面討論一個(gè)簡(jiǎn)單的實(shí)現(xiàn)，雖然簡(jiǎn)單，但是該有的部分都還是有的。

一，訪問接口

創(chuàng)建一個(gè)hashtable.

hashtable hashtable_new(int size) // size表示包含的接點(diǎn)個(gè)數(shù)。

存入key-value至hashtable中。

void hashtable_put(hashtable h,const char* key,void *val);

根據(jù)key從hashtable中取出value值。

void * hashtable_get(hashtable h,const char *key);

釋放hashtable。

void hashtable_free(hashtable h);

釋放單個(gè)hash 接點(diǎn)

void hashtable_delete_node(hashtable h, const char *key);

二，數(shù)據(jù)結(jié)構(gòu)

hash接點(diǎn)的結(jié)構(gòu)：

typedef struct hashnode_struct{
      struct hashnode_struct *next;
      const char *key;
      void *val;
}*hashnode,_hashnode;

這個(gè)結(jié)構(gòu)還是很容易理解的，除了必須的key-value之外，包含一個(gè)用于沖突的鏈表結(jié)構(gòu)。

hashtable的數(shù)據(jù)結(jié)構(gòu)：

typedef struct hashtable_struct{
     pool_t p;
     int size;
     int count;
     struct hashnode_struct *z;
}*hashtable,_hashtable;

對(duì)這個(gè)結(jié)構(gòu)說明如下：

pool_t：內(nèi)存池結(jié)構(gòu)管理hashtable使用的內(nèi)存。結(jié)構(gòu)參考"

size：當(dāng)前hash的接點(diǎn)空間大小。

count：用于表示當(dāng)前接點(diǎn)空間中可用的hash接點(diǎn)個(gè)數(shù)。

z：用于在接點(diǎn)空間中存儲(chǔ)接點(diǎn)。

三，創(chuàng)建hashtable

代碼如下：

hashtable hashtable_new(int size)
{
    hashtable ht;
    pool_t p;

    p = _pool_new_heap(sizeof(_hashnode)*size + sizeof(_hashtable));
    ht= pool_malloc(p, sizeof(_hashtable));
    ht->size = size;
    ht->p = p;
    ht->z = pool_malloc(p, sizeof(_hashnode)*prime);
    return ht;
}

這個(gè)函數(shù)比較簡(jiǎn)單，先定義并初始化一個(gè)內(nèi)存池，大小根據(jù)size而定，所以在實(shí)際使用時(shí)，我們的size應(yīng)該要分配的相對(duì)大點(diǎn)，比較好。

四，存入key-value值

在這個(gè)操作之前，先要定義一個(gè)根據(jù)KEY值計(jì)算hashcode的函數(shù)。

static int hashcode(const char *s, int len)
{
    const unsigned char *name = (const unsigned char *)s;
    unsigned long h = 0, g;
    int i;

    for(i=0;i<len;i++)
    {
        h = (h << 4) + (unsigned long)(name[i]); //hash左移4位，當(dāng)前字符ASCII存入hash  
        if ((g = (h & 0xF0000000UL))!=0)     
            h ^= (g >> 24);
        h &= ~g;  //清空28-31位。 
    }

    return (int)h;
}

這個(gè)函數(shù)采用精典的ELF hash函數(shù)。

代碼如下：

void hashtable_put(hashtable h, const char *key, void *val)
{
    if(h == NULL || key == NULL) 
	return;

    int len = strlen(key);
    int index = hashcode(key,len);
    hashtable node;
    h->dirty++;

    if((node = hashtable_node_get(h, key,len, index)) != NULL)  //如果已經(jīng)存在，就替換成現(xiàn)在的值，因?yàn)楝F(xiàn)在的比較新。
    {
        n->key = key;
        n->val = val;
        return;
    }

    node = hashnode_node_new(h, index);  // 新建一個(gè)HASH NODE接點(diǎn)。
    node->key = key;
    node->val = val;
}

hashtable_node_get用于查找該KEY是否在HASH中已經(jīng)存在，實(shí)現(xiàn)很簡(jiǎn)單，如下：

static hashnode hashtable_node_get(hashtable h, const char *key, int len, int index)
{
    hashnode node;
    int i = index % h->size;
    for(node = &h->z[i]; node != NULL; node = node->next) // 在index值 [HASH值] 所對(duì)應(yīng)的HASH桶上遍歷尋找
        if(node->key != NULL && (strlen(node->key)==len) && (strncmp(key, node->key, len) == 0))
            return node;
    return NULL;
}

新建一個(gè)HASH NODE接點(diǎn)如下：

static hashnode hashnode_node_new(hashtable h, int index)
{
    hashnode node;
    int i = index % h->size;

    h->count++;

    for(node = &h->z[i]; node != NULL; node = node->next)
        if(node->key == NULL)   //這里的處理是：如果在HASH桶中存在某個(gè)值，KEY是空的，表明這個(gè)值已經(jīng)沒有用了，就用它來替換為現(xiàn)在準(zhǔn)備寫入的新接點(diǎn)。
            return node;

    node = pool_malloc(h->p, sizeof(_hashnode)); // 新建一個(gè)接點(diǎn)
    node->next = h->z[i].next;   // 加入到桶中，就是加到鏈表的第一個(gè)接點(diǎn)。
    h->z[i].next = node;
    return node;
}

五，從HASHTABLE中獲取接點(diǎn)

根據(jù)KEY從hashtable中獲取接點(diǎn)，步驟是先根據(jù)KEY計(jì)算hash值，然后從hashtable中找到指定的接點(diǎn)或者接點(diǎn)鏈表。如下：

void *hashtable_get(hashtable h, const char *key)
{
    if(h == NULL || key == NULL) 
	return NULL;
    hashnode node;
    int len = strlen(key);
    if(h == NULL || key == NULL || len <= 0 || (node = hashtable_node_get(h, key, len, hashcode(key,len))) == NULL)
    {
        return NULL;
    }
    return node->val;
}

這個(gè)函數(shù)就很容易理解了。

六，釋放HASHTABLE

hashtable的釋放就比較簡(jiǎn)單了，因?yàn)槲覀兯械膬?nèi)存申請(qǐng)都在內(nèi)存池上完成的，就只需要釋放內(nèi)存池，如下：

void hashtable_free(hashtable h)
{
    if(h != NULL)
        pool_free(h->p);
}

七，釋放單個(gè)hash接點(diǎn)

代碼如下：

void hashtable_delete_node(hashtable h, const char *key)
{
    if(h == NULL || key == NULL) 
	return;
    hashnode node;
    int len = strlen(key);
    if(h == NULL || key == NULL || (node = hashtable_node_get(h, key, len, hashcode(key,len))) == NULL)  //沒有這個(gè)接點(diǎn)
        return;

    node->key = NULL;
    node->val = NULL;

    h->count--;
}

這個(gè)就實(shí)現(xiàn)了一個(gè)簡(jiǎn)單的HASHTABLE結(jié)構(gòu)，當(dāng)然后還是有不足的，比如遍歷HASHTABLE，如果用數(shù)組的方式來遍歷，效率肯定很低，下面討論一種實(shí)現(xiàn)方案，用于遍歷hashtable.

八，hashtable的遍歷討論

直接用數(shù)組，就是hashtable中的struct hashnode_struct數(shù)組是可以遍歷，但如果只包含一個(gè)接點(diǎn)，也要遍歷所有的數(shù)組，如下遍歷：

void hashtable_traverse(hashtable h)
{
    int i;
    hashnode node;
    if(h == NULL)
        return;
    for(i = 0; i < h->prime; i++)
        for(node = &h->z[i]; node != NULL; node = node->next)
            if(node->key != NULL && node->val != NULL)
                XXXXXXXXXXXXXXXXX  // 這里是一些操作。
}

這樣效率很低，其實(shí)在接點(diǎn)中包含了next域，可以用這個(gè)來實(shí)現(xiàn)遍歷。

需要對(duì)前面hashtable數(shù)據(jù)結(jié)構(gòu)做簡(jiǎn)單的改動(dòng)，增加兩個(gè)域：

typedef struct hashtable_struct{
     pool_t p;
     int size;
     int count;
     struct hashnode_struct *z;
     int bucket;
     hashnode node;
}*hashtable,_hashtable;

就是增加了bucket和node兩個(gè)域，加這兩個(gè)域的思路是這樣的：

node表示當(dāng)前遍歷的游標(biāo)，在遍歷過程中，不斷的移動(dòng)這個(gè)接點(diǎn)所指向的接點(diǎn)。
bucket是和node相關(guān)聯(lián)的，用于記錄當(dāng)前的node在哪個(gè)桶上。

首先建立連接，就是將所有的接點(diǎn)都連接起來，按照慣例，也采用XXX_iter_first函數(shù)，先初始化，如下：

int hashtable_iter_first(hashtable h) {
    if(h == NULL) 
	return 0;
    h->bucket = -1;
    h->node = NULL;
    return hashtable_iter_next(h);
}

hashtable_iter_next用于獲取下一個(gè)接點(diǎn)，如果這時(shí)游標(biāo)已經(jīng)確定，那下一個(gè)接點(diǎn)就會(huì)被很快的被確定，定義如下：

int xhash_iter_next(xht h) {
    if(h == NULL) return 0;
    while(h->node != NULL) {
        h->node = h->node->next;   // 移向下一個(gè)接點(diǎn)，如果接點(diǎn)合法，返回成功
        if(h->node != NULL && h->node->key != NULL && h->node->val != NULL)
            return 1;
    }
    for(h->bucket++; h->bucket < h->prime; h->bucket++) {
        h->node = &h->z[h->bucket];

        while(h->node != NULL) {
            if(h->node->key != NULL && h->node->val != NULL)
                return 1;
            h->node = h->node->next;
        }
    }
    h->bucket = -1;  // 不存在下一個(gè)接點(diǎn)。
    h->node = NULL;
    return 0;
}

有了上面兩個(gè)方法之后，遍歷操作如下：

hashtable ht
if(hashtable_iter_first(ht))   //取第一個(gè)接點(diǎn)。 
do{
    // 此時(shí)可以處理ht->node，表示當(dāng)前的接點(diǎn)。
}while(hashtable_iter_next(ht));  //取下一個(gè)接點(diǎn)

這樣處理的話，是不是高效多了。當(dāng)然在第一遍的時(shí)候，還是需要遍歷整個(gè)數(shù)組和數(shù)組下的桶中接點(diǎn)。不過這樣操作之后，在刪除一個(gè)結(jié)點(diǎn)的時(shí)候，就需要做一些操作。刪除一個(gè)接點(diǎn)時(shí)，需要考慮當(dāng)前的h->node是不是當(dāng)前被刪除的接點(diǎn)，如果是，就把h->node稱至下一個(gè)接點(diǎn)。就是刪除之后，要作如下處理，假如刪除了。

假如被刪除的接點(diǎn)為node,需要如下處理：

    if(h->node == n)
        hashtable_iter_next(h);

將h->node移動(dòng)到下一個(gè)接點(diǎn)。

本站是提供個(gè)人知識(shí)管理的網(wǎng)絡(luò)存儲(chǔ)空間，所有內(nèi)容均由用戶發(fā)布，不代表本站觀點(diǎn)。請(qǐng)注意甄別內(nèi)容中的聯(lián)系方式、誘導(dǎo)購買等信息，謹(jǐn)防詐騙。如發(fā)現(xiàn)有害或侵權(quán)內(nèi)容，請(qǐng)點(diǎn)擊一鍵舉報(bào)。

小男孩‘自慰网亚洲一区二区,亚洲一级在线播放毛片,亚洲中文字幕av每天更新,黄aⅴ永久免费无码,91成人午夜在线精品,色网站免费在线观看,亚洲欧洲wwwww在线观看

C語言的HashTable簡(jiǎn)單實(shí)現(xiàn)