Identifying Suspicious Activity in the Community Hub

What is considered suspicious activity?

  1. Manually creating, or scripting the creation of, a large number of fake accounts to collect rewards.
    1. Using a masking verification code platform and a number pool to receive the verification code for bulk registration.
    2. Directly calling the API interface to register.
    3. Registration information from multiple users (including IP, UserAgent, page tracking, device fingerprint, etc.) is aggregated under the same invitation code.
    4. Typical features of suspicious activity:
      1. A high volume of registrations in a short time interval (more advanced scripts can add custom time variations)
      2. The mobile phone number is identified as having a high probability of being multiplexed (when multiple digital signals are combined into one signal over a shared medium)
      3. Missing front-end identification or UserAgent information
  2. A large number of invitations to “spam” users or invalid invitations
    1. Features:
      1. A large number of the inviter’s invitees come from similar virtual number segments, whereas a compliant user’s invitations should bring in a large number of irregular mobile phone numbers.
      2. A large number of invitees have lower security levels, and there are characteristics that signal group control, such as click farms or other forms of automation.
      3. The same email is registered multiple times (QQ mailbox, NetEase mailbox, etc.). Emails can be produced by a random number generator, or registered using leaked passwords from a mailbox database purchased on the black market. Overlapping emails in our library occur when a large number of people are using these leaked mailbox databases.
  3. Using a crowdsourcing platform
    1. Traffic from crowdsourcing platforms is considered fraudulent traffic, e.g., paying users on sites like TaskRabbit to participate in the referral program.
  4. Reports from community members

Anti-Fraud Funnel

By studying various examples of suspicious activity, we have organized our strategy into a five-layer funnel. Users who fail the five-layer screening are likely to be identified as suspicious, either through their own activity as an inviter or through their connection to a large number of suspicious invitees; e.g., if an inviter has a large number of invitees that are identified as bots, then that inviter is banned. Since the funnel strategy is processed via a decision tree, a small number of legitimate users may be erroneously flagged in certain situations.

If you believe you fall into this category, please appeal the decision by filling out this form: https://www.wjx.top/jq/26772008.aspx

Policy diffusion:

Previous strategies identified suspicious users who were using the same invitation code. In the policy diffusion phase, the following operations were undertaken:


Operation 1: Aggregating users across many different invitation codes to check whether invitees’ passwords or mailboxes are the same.
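
As a minimal sketch of this aggregation, assume each registration is available as a pair of invitation code and a credential key (for example a password hash or a mailbox address). The record shape, the function name, and the `min_codes` parameter are hypothetical and not taken from our production system. One could group by credential and keep those that show up under more than one invitation code:

```python
from collections import defaultdict


def shared_credentials_across_codes(registrations, min_codes=2):
    """Map each credential key to the invitation codes it appears under.

    registrations: iterable of (invitation_code, credential_key) pairs.
    Only credentials seen under at least `min_codes` distinct codes are kept.
    """
    codes_by_key = defaultdict(set)
    for code, key in registrations:
        codes_by_key[key].add(code)
    return {
        key: codes
        for key, codes in codes_by_key.items()
        if len(codes) >= min_codes
    }


# Hypothetical sample data: "h1" is reused under two different codes.
sample = [("A", "h1"), ("B", "h1"), ("C", "h2"), ("C", "h2")]
overlaps = shared_credentials_across_codes(sample)
```

Here `overlaps` contains only `"h1"`, because `"h2"` appears under a single invitation code.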

The first layer of the funnel: Using a blacklisted SIM card pool of known SIM card dealers and code platforms to filter the referral database. If we discovered that a user had numbers from one of these click farm platforms in their referral tree, then everyone in their referral tree was flagged and further investigated.
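
The flagging step in this layer can be sketched roughly as follows. This is an illustration under stated assumptions, not our actual implementation: the dictionaries `inviter_of` and `phone_of`, the function name, and the choice to treat a "referral tree" as every account sharing the same root inviter are all hypothetical.

```python
from collections import defaultdict


def flag_referral_trees(inviter_of, phone_of, blacklisted_numbers):
    """Return every user whose referral tree contains a blacklisted number.

    inviter_of: dict of user id -> inviter id (None for users nobody invited)
    phone_of: dict of user id -> registered phone number
    blacklisted_numbers: set of numbers from the blacklisted SIM card pool
    Assumption: a tree is the set of users that share one root inviter.
    """
    def root_of(user):
        seen = set()
        # Walk up the inviter chain; `seen` guards against malformed cycles.
        while inviter_of.get(user) is not None and user not in seen:
            seen.add(user)
            user = inviter_of[user]
        return user

    members = defaultdict(set)
    for user in phone_of:
        members[root_of(user)].add(user)

    flagged = set()
    for tree in members.values():
        if any(phone_of[u] in blacklisted_numbers for u in tree):
            flagged |= tree
    return flagged


# Hypothetical data: "c" registered with a blacklisted number.
inviter_of = {"a": None, "b": "a", "c": "b", "x": None, "y": "x"}
phone_of = {"a": "100", "b": "101", "c": "999", "x": "200", "y": "201"}
flagged = flag_referral_trees(inviter_of, phone_of, {"999"})
```

In this example the whole tree rooted at `"a"` is flagged for investigation, while the tree rooted at `"x"` is left alone.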

The second layer: Identifying an account that had problems after manual screening and confirmation (customer feedback data + report + crowdsourcing).

The third layer: The average interval of registration time using the same invitation code is suspiciously short according to specified thresholds.
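
A minimal sketch of this check, assuming registration times for one invitation code are available as `datetime` objects. The function names and the 60-second default threshold are placeholders chosen for illustration; the actual thresholds are not stated here.

```python
from datetime import datetime, timedelta


def average_interval_seconds(timestamps):
    """Mean gap in seconds between consecutive registrations.

    Returns None when there are fewer than two registrations.
    """
    if len(timestamps) < 2:
        return None
    ordered = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


def is_suspiciously_fast(timestamps, threshold_seconds=60.0):
    """True if the average gap is below the (hypothetical) threshold."""
    avg = average_interval_seconds(timestamps)
    return avg is not None and avg < threshold_seconds


# Hypothetical registrations under one invitation code: gaps of 10 s and 20 s.
base = datetime(2024, 1, 1, 12, 0, 0)
times = [base, base + timedelta(seconds=10), base + timedelta(seconds=30)]
avg_gap = average_interval_seconds(times)  # 15.0
```

Note that an average alone can be evaded by scripts that add custom time variations, so a check like this is only one signal among the layers.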

The fourth layer: A user who, after being run through the first three layers, has a risk level greater than 6 exceeds the safety ratio (see Appendix for the risk level calculation method). They are then placed into the fifth and final layer for account investigation.

The fifth layer: For each user, we collected some device information, including registered mailbox, registration page behavior, mouse tracking, etc. Together, this information creates a unique user fingerprint for each device. In this fifth layer, we calculated the uniqueness of your invitees’ fingerprints. In an ideal state, each invitee has a unique fingerprint: if you invite 100 people, there will be 100 unique fingerprints, which gives an “information density” of 1. The higher the fingerprint information density, the higher the likelihood of rule-breaking.

Information density = Σ (number of accounts sharing each unique fingerprint)^2 / (total number of accounts)
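
To make the formula concrete, here is a small sketch that reads it as the sum of squared per-fingerprint account counts divided by the total number of accounts. That reading matches the statement that fully unique fingerprints give a density of 1. The function name and the representation of a fingerprint as a hashable value are assumptions made for illustration.

```python
from collections import Counter


def information_density(fingerprints):
    """Sum of squared counts per unique fingerprint, divided by account count.

    fingerprints: one hashable fingerprint value per invited account.
    """
    if not fingerprints:
        raise ValueError("at least one account is required")
    counts = Counter(fingerprints)
    return sum(c * c for c in counts.values()) / len(fingerprints)


# 100 invitees, all with distinct fingerprints: 100 * 1^2 / 100
all_unique = information_density([f"fp-{i}" for i in range(100)])  # 1.0

# 10 invitees sharing one fingerprint: 10^2 / 10
all_shared = information_density(["fp-0"] * 10)  # 10.0
```

Under this reading, the density ranges from 1 (every invitee unique) up to the number of invitees (every invitee sharing a single fingerprint).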

 

Determination of reasonable range

According to the distribution of the data, we identified a reasonable threshold in the interval where the data frequency decreases rapidly. To keep our strategy lenient, we select cut-off points where the data anomalies are as obvious as possible.

Manual verification

Applying the policy funnel may accidentally flag some innocent users or fail to cover every situation we have outlined, so we will continue to manually check the filtered data after it has gone through our funnel.

Potential Flags on Legitimate Users

If an account submits an appeal but fails to provide valid proof materials, we will check whether there is a large amount of aggregated fingerprint information among the invited users, and select the users who meet both of the following conditions.


Condition 1: Information density ≤ 1.7
Condition 2: Answer rate ≤ 0.9

Potential for Undiscovered Situations

For certain situations that cannot be fully covered, we filter out users who meet any one of the following conditions.

Condition 1: Inviter ≥ 100, invitee cheating rate ≥ 0.2
Condition 2: Lower line cheating rate > 0.7, information density ≥ 10
Condition 3: Lower line cheating rate ∈ [0.5, 0.7], information density ≥ 10, answer correct rate = 1
Condition 4: Risk level ≥ 8, information density ≥ 10
Condition 5: Inviter ≥ 10, invitee cheating rate > 0.7, information density ≥ 10, unique fingerprint data ≤ 2
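
The five conditions above can be expressed as a simple rule check. The sketch below rests on several assumptions that the text does not confirm: it reads the "Inviter" figure in Conditions 1 and 5 as the inviter's number of invitees, it treats the cheating rates named in the conditions as a single per-inviter figure, and the `InviterStats` fields and function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InviterStats:
    invitee_count: int          # assumed meaning of "Inviter" in Conditions 1 and 5
    cheat_rate: float           # assumed: share of invitees flagged as cheating
    information_density: float
    answer_correct_rate: float
    risk_level: int
    unique_fingerprints: int


def matched_conditions(s):
    """Return the numbers of the conditions this inviter meets (any one suffices)."""
    checks = {
        1: s.invitee_count >= 100 and s.cheat_rate >= 0.2,
        2: s.cheat_rate > 0.7 and s.information_density >= 10,
        3: (0.5 <= s.cheat_rate <= 0.7
            and s.information_density >= 10
            and s.answer_correct_rate == 1),
        4: s.risk_level >= 8 and s.information_density >= 10,
        5: (s.invitee_count >= 10
            and s.cheat_rate > 0.7
            and s.information_density >= 10
            and s.unique_fingerprints <= 2),
    }
    return [number for number, hit in checks.items() if hit]


# Hypothetical inviters for illustration.
large_inviter = InviterStats(150, 0.25, 1.2, 0.5, 3, 140)
dense_inviter = InviterStats(20, 0.8, 12.0, 1.0, 9, 2)
clean_inviter = InviterStats(5, 0.0, 1.0, 0.9, 1, 5)
```

An inviter is filtered out when `matched_conditions` returns a non-empty list; returning the matching condition numbers rather than a bare boolean keeps the reason available for manual verification and appeals.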

 

 
