5G编程聚合网

Counting user visits with Redis

King Wang

January 3, 2022

Contents

  • 1. Use Hash
  • 2. Use Bitset
  • 3. Use probability algorithm

1. Use Hash

Hash is one of the basic data structures in Redis. Under the hood, Redis maintains an open hash table (separate chaining): each key is mapped to a slot in the hash table, and keys that collide in the same slot are chained together in a list.

When a user visits a page, we identify them by their user ID if they are logged in. If they are not logged in, we can generate a random key to identify them instead. On each visit we issue an HSET command: the key can be built from the URI plus the date, the field is the user's ID or random identifier, and the value can simply be set to 1.

To get the number of visitors to a page on a given day, call HLEN on the corresponding key.
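The HSET/HLEN flow described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: the key layout `visits:<uri>:<date>` and the function names are assumptions made for illustration, and `client` is assumed to be a redis-py style client exposing `hset` and `hlen`.

```python
from datetime import date
from typing import Any


def visit_key(uri: str, day: date) -> str:
    # Hypothetical key layout: one hash per page per day.
    return f"visits:{uri}:{day.isoformat()}"


def record_visit(client: Any, uri: str, day: date, visitor_id: str) -> None:
    # field = user ID (or the random identifier of an anonymous user), value = 1.
    # Repeated visits by the same user overwrite the same field.
    client.hset(visit_key(uri, day), visitor_id, 1)


def count_visitors(client: Any, uri: str, day: date) -> int:
    # HLEN returns the number of fields, i.e. the number of distinct visitors.
    return client.hlen(visit_key(uri, day))


# Usage against a real server (assumes the redis-py package and a local Redis):
#   import redis
#   client = redis.Redis()
#   record_visit(client, "/item/1", date.today(), "42")
#   print(count_visitors(client, "/item/1", date.today()))
```

Because each visitor occupies one field, the hash grows linearly with the number of distinct visitors, which is the memory cost listed below.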

  • Advantages: Simple and easy to implement. Queries are straightforward, and the count is exact.
  • Disadvantages: High memory usage. Performance degrades as the number of keys grows, so this approach cannot support massive traffic.

2. Use Bitset

If an int is used to store a single user ID, it records only one user. If instead each of its 32 bits marks one user, the same int can represent 32 users, a 32-fold improvement in space utilization. For massive data sets, this storage scheme saves a lot of memory. For users who are not logged in, a hash function can map the user's identifier to a numeric ID. For 100 million users, we need only 100000000/8/1024/1024, or about 12 MB of space.

Redis provides the SETBIT command, which makes this easy to use. On the item page, call SETBIT on every visit to mark that the user has visited the page. GETBIT can then tell you whether a given user has visited, and BITCOUNT returns the number of visitors to the page for each day.
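The arithmetic above, together with the hashing of anonymous users into numeric IDs, can be checked with a short sketch. The helper names and the choice of SHA-256 are assumptions made for illustration; any stable hash function would serve the same purpose.

```python
import hashlib


def bitmap_bytes(id_count: int) -> int:
    # One bit per ID, rounded up to whole bytes.
    return (id_count + 7) // 8


def anonymous_id(identifier: str, id_space: int) -> int:
    # Map an arbitrary identifier (e.g. a random cookie value) to a
    # numeric ID in [0, id_space). Different identifiers can collide.
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % id_space


size = bitmap_bytes(100_000_000)
print(size, round(size / 1024 / 1024, 2))  # 12500000 11.92
```

The smaller `id_space` is, the more likely two anonymous users are to share a bit, so the choice trades memory against accuracy.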
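The three commands can be wrapped as follows. This is a minimal sketch under assumptions: the key layout `visits:bitmap:<uri>:<date>` and the function names are hypothetical, and `client` is assumed to be a redis-py style client exposing `setbit`, `getbit`, and `bitcount`.

```python
from datetime import date
from typing import Any


def bitmap_key(uri: str, day: date) -> str:
    # Hypothetical key layout: one bitmap per page per day.
    return f"visits:bitmap:{uri}:{day.isoformat()}"


def mark_visited(client: Any, uri: str, day: date, user_id: int) -> None:
    # The numeric user ID is used directly as the bit offset.
    client.setbit(bitmap_key(uri, day), user_id, 1)


def has_visited(client: Any, uri: str, day: date, user_id: int) -> bool:
    return client.getbit(bitmap_key(uri, day), user_id) == 1


def count_bitmap_visitors(client: Any, uri: str, day: date) -> int:
    # BITCOUNT returns the number of bits set to 1, i.e. distinct visitors.
    return client.bitcount(bitmap_key(uri, day))
```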

  • Advantages: Small memory footprint and easy to query. You can also check whether a specific user has visited.

  • Disadvantages: For users who are not logged in, different keys may hash to the same ID. Avoiding this requires maintaining a mapping for those users, which adds cost. If user IDs are too sparse, the bitmap may take up more memory than the first method.
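The sparsity problem comes from the fact that Redis grows the bitmap string to hold the highest offset that was set, so its size depends on the largest ID rather than on the number of visitors. A rough illustration, with a hypothetical helper name:

```python
from typing import Iterable


def bitmap_size_bytes(user_ids: Iterable[int]) -> int:
    # The bitmap must be long enough to contain the bit at the largest offset.
    ids = list(user_ids)
    if not ids:
        return 0
    return max(ids) // 8 + 1


# Three visitors, but one of them has a very large ID.
print(bitmap_size_bytes([7, 1_000, 99_999_999]))  # 12500000
# The same number of visitors with small IDs needs only a few bytes.
print(bitmap_size_bytes([7, 8, 9]))  # 2
```

With only a handful of visitors, a hash that stores three fields would be far smaller than the first bitmap.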

3. Use probability algorithm

If a page has very heavy traffic and the count does not need to be exact, consider a probabilistic algorithm. Redis ships with an implementation of HyperLogLog, a cardinality estimation algorithm: it does not store the individual values, only a small amount of data from which the count is estimated.
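To make the idea of "storing only data used for estimation" concrete, here is a toy HyperLogLog in pure Python. It is a simplified sketch for illustration only and is not how Redis implements it: the class name, the use of SHA-256, and the default of 4096 registers are all assumptions. (Redis itself uses 16384 six-bit registers, which is where the 12 KB per key comes from.)

```python
import hashlib
import math


class ToyHyperLogLog:
    """Simplified HyperLogLog; keeps m small registers, never the items."""

    def __init__(self, p: int = 12) -> None:
        self.p = p                      # number of bits used as register index
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")          # 64-bit hash
        index = h >> (64 - self.p)                     # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)          # remaining bits
        # Rank = 1-based position of the first 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self) -> int:
        alpha = 0.7213 / (1 + 1.079 / self.m)          # constant for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            raw = self.m * math.log(self.m / zeros)
        return round(raw)


hll = ToyHyperLogLog()
for i in range(20000):
    hll.add(f"user:{i}")
print(hll.count())  # an estimate close to 20000, not an exact count
```

Adding the same user again never changes a register, which is why repeated visits are not double-counted.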
When a user visits the site, call PFADD to add the user's identifier to the corresponding key. PFCOUNT then returns the estimated number of distinct visitors. Because this is a probabilistic algorithm, the result carries some error (the Redis documentation cites a standard error of about 0.81%).
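The PFADD/PFCOUNT usage can be sketched like this. As before, the key layout `visits:hll:<uri>:<date>` and the function names are assumptions for illustration, and `client` is assumed to be a redis-py style client exposing `pfadd` and `pfcount`.

```python
from datetime import date
from typing import Any


def hll_key(uri: str, day: date) -> str:
    # Hypothetical key layout: one HyperLogLog per page per day.
    return f"visits:hll:{uri}:{day.isoformat()}"


def add_visitor(client: Any, uri: str, day: date, visitor_id: str) -> None:
    # PFADD accepts any string, so anonymous identifiers need no numeric mapping.
    client.pfadd(hll_key(uri, day), visitor_id)


def estimate_visitors(client: Any, uri: str, day: date) -> int:
    # PFCOUNT returns an approximate count of distinct visitors.
    return client.pfcount(hll_key(uri, day))
```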

  • Advantages: Minimal memory footprint; each key needs only 12 KB. Very efficient for sites with very heavy traffic.

  • Disadvantages: It cannot reliably tell whether a specific user has visited, and the total count is an estimate rather than an exact value.
