## title: Bloom Filter

## Description

A Bloom filter is a data structure which is similar to a set. The Bloom filter allows for the question *Is this item a member of the set?* to be answered quickly. The filter will never return *No* if the item is in the set; but, it may return *Yes* if the item is not in the set. This is one downside to using a Bloom filter, there is a chance for false positive results when checking if an item is in the set. An upside to using a Bloom filter is that adding an item to the set and checking if the item is in the set is a constant time O(n) operation.

## Example

The following example uses a Bloom filter to create a friend list. The example implementation uses three hashing functions. The hashing functions ingest a string (a friend name) and calculate a single value for the string based on the number of spots in the Bloom filter.

Create filter as an array of 10 indices. A `0`

indicates no item is at that index.

`[0,0,0,0,0,0,0,0,0,0]`

A user adds David to the friend list. The string (`'David'`

) is put through several hashing functions which return `0`

, `4`

, and `8`

, respectively. The values from the hashing functions are used to update the filter array at those indices.

The filter indices are updated using those hashed values. A `1`

indicates an item has been added at that index.

`[1,0,0,0,1,0,0,0,1,0]`

A user adds Rosie to the friend list. The hashing functions return `3`

, `4`

, and `6`

for `'Rosie'`

. The filter indices are updated using the hashed values.

`[1,0,0,1,1,0,1,0,1,0]`

There is a check if Chuck is a member of the friend list. The string `'Chuck'`

is put through the hashing functions returning `1`

, `3`

, and `6`

. When the filter array is checked at those indices, it returns `0`

, `1`

, and `1`

. Because one index has a `0`

value, Chuck is *definitely* not a member of the list.

There is a check if Maja is a member of the friend list. The string `'Maja'`

is put through the hashing functions, returning `0`

, `6`

, and `8`

. When the filter array is checked at those indices, it returns `1`

, `1`

, and `1`

. Because all three indices have a value of `1`

, Maja *may* already be a member of the list. This is a false positive result.

## Considerations

Bloom filters allow for quick look up to determine if a value is *possibly* a member of the set or *definitely* not a member of the set. The more items added to the Bloom filter the rate of false positive results when checking if an item is a member of the set will increase. One way to decrease the rate of false positive results is to increase the size of the array. Although this is a compromise because the larger the array, the greater memory it will occupy. It is necessary to determine an acceptable rate of false positive results for a given array size.