Scams and Fraud Detection in Voice over IP Networks

Final Report Summary - SCAMSTOP (Scams and Fraud Detection in Voice over IP Networks)

Project Context and Objectives:

Project context

Different definitions of fraud are reflected in the literature. However, fraud can simply be seen as any activity that leads to obtaining a financial advantage or causing a loss by implicit or explicit deception. In traditional telecommunication networks, fraud is already a threat that deprives telecom operators of huge amounts of money every year. With the migration from circuit-switched networks to IP-based networks, the situation is expected to get worse. This is mainly due to the lack of strong built-in security mechanisms and the use of open standards in IP-based networks. In fact, the openness, innovative services and low cost structure of voice over IP (VoIP) services have helped VoIP providers to attract large numbers of subscribers over the past few years. Unfortunately, these same reasons have also attracted attackers and malicious users. In its 'Fraud loss survey' published in 2011, the Communication Fraud Control Association (CFCA) reports that telecom fraud costs businesses more than USD 40 billion every year.

Like any telecommunication operator, VoIP providers are a preferred target of fraudsters. Fraud causes a loss of revenue, which is by itself a huge problem in a market based on very tight margins. In addition, news about successful fraud attacks on a VoIP provider can very easily tarnish the provider's reputation and cause a drop in confidence, followed by a drop in stock prices and a loss of subscribers.

It is worth mentioning that fraud detection and intrusion detection have traditionally been completely separate research areas. Fraud detection solutions have mainly been developed by companies to protect their assets, and these solutions were usually undisclosed. Nevertheless, one can find a few research works dealing with fraud detection, most of them based on artificial intelligence.

Intrusion detection is an area where the network is monitored for malicious activities or policy violations. The monitoring output is then reported to a management system. The research community has been developing this area for at least the last 20 years. However, commercial solutions appear somewhat out of date, as they have adopted only some of the simpler intrusion detection techniques. As research is the main motivation behind intrusion detection, this area seems to evolve faster than fraud detection.

As the convergence between telecom networks and the internet is occurring and a lot of telecom services are being replaced by similar services which are IP-based, it is legitimate to ask ourselves whether it still makes sense to keep fraud detection and intrusion detection separate. Taking into account all the aforementioned drawbacks, the main objective of SCAMSTOP is to investigate fraud detection in the light of intrusion detection.

VoIP deployment

Understanding the business model of the VoIP providers is the first step for developing an anti-fraud system that can be successfully deployed by them. For instance, if we intend to develop per user signatures and profiles, we need first to check the VoIP providers' price plans. Just implementing features such as the number of mobile, national or international calls during a given time interval might not help as with the current proposed flat rate options, most of these calls are free of charge and no fraud activity is expected. As an exception here, the VoIP provider might want to monitor the flat rate based service usage to check whether this service is profitable or not.

VoIP providers have not only residential customers but often enterprises as well. An enterprise can either:

- use an online PBX service where all VoIP services are offered by the VoIP network. The customer uses closed sets of IP phones or SIP gateways (offered by the VoIP provider or one of its distributors). In this offer, the customer does not get any credentials (username/password); both the IP phones and SIP gateways are automatically provisioned.
- or use its own PBX (or a different device) to place and receive calls using the VoIP network. The configuration data (server, username, and password) that the customer has to use is usually provided by the VoIP provider when the service starts.

To connect the enterprise PBX to the VoIP provider network, a SIP trunk is used. A SIP trunk is a service offered by the VoIP provider including multiple voice sessions (as many as the enterprise needs) in addition to other features such as instant messaging (IM), presence applications, and data sharing.

Usually, the VoIP services provided to enterprises are charged on a post-paid basis. This, unfortunately, opens doors for potential fraudulent activities.

SCAMSTOP objectives

The SCAMSTOP project aims at protecting VoIP infrastructures by mitigating fraud attempts, thus protecting the VoIP providers against revenue losses and users against theft. This is achieved by providing a complete solution that helps VoIP providers define an efficient fraud detection and management strategy. Based on the above description, the SCAMSTOP project pursues the following major objectives:

- Design and implement a general framework for detecting and protecting VoIP services from fraud and misuse. Within this framework, the different usage scenarios of VoIP are to be considered and the varying needs of the participating small and medium-sized enterprises (SMEs) will be accommodated. This framework will present a general guidance on the deployment of fraud detection systems in VoIP environments in terms of monitoring, detection and alarming facilities.
- Specify and develop innovative and adaptive algorithms for misuse and fraud detection. This will be the core effort in SCAMSTOP. In the design of these algorithms, we will not only aim at achieving a high detection rate but also target a scalable design and low processing and memory resources so as to ensure the applicability of these algorithms in large scale VoIP deployments.
- Support different means for detection that can be dynamically adapted to the user and provider specific needs as well as to the traffic patterns.
- Implementation, integration and testing of the developed tools and solutions in a provider's VoIP infrastructure

SCAMSTOP objectives for the first year

As mentioned in the Technical Annex (page 45), the SCAMSTOP project is supposed to achieve during the reporting period from month 1 to 12, the following technical objectives:

- Investigation of fraud scope: Define the terminology, describe the typology, provide some statistics
- Investigation of fraud in telecommunications: Fraud classification, describe the corresponding typology, investigate fraud in PSTN and mobile networks (scenarios, tools, etc.)
- Investigation of fraud problem in VoIP networks: Comparison with fraud in traditional networks, provide fraud use case scenarios in VoIP, describe tools used for carrying out fraud activities.
- Investigation of fraud detection in VoIP networks: Description and evaluation of the existing techniques, describe the data to be collected for fraud detection and the problems facing the fraud detection in VoIP networks
- Anti-fraud architecture: Identification of the different blocks of the architecture, identification of the algorithms to be used, as well as the interfaces between the different components
- Starting the development of the anti-fraud framework: A basic version of the rule-based fraud detection will be provided
- Design and development of risk assessment methodology for the SCAMSTOP architecture

SCAMSTOP objectives for the second year

The SCAMSTOP project aimed at achieving the following objectives during the time period from month 13 to 24:

- Explore the available components for realising some of the needed functionalities, such as SIP servers and clients, data mining frameworks, and expert systems, and investigate their suitability for the project
- Explore the call detail records (CDRs) fields that can be used for developing the detection algorithms
- Design and development of tools to generate fraud-based CDR
- Develop various algorithms for VoIP fraud detection. These include: Bayesian networks, neural network self-organising maps (NN-SOM), the nearest neighbour algorithm, balanced iterative reducing and clustering using hierarchies (BIRCH), a signature-based technique, and a rule-based technique
- Specify and develop an event-based system that will coordinate the activities of the different detection components
- Specify and develop a friendly web interface that allows the fraud expert to configure the different algorithms and visualise the results
- Prepare CDR data for testing purposes. Here, a SIP client needs to be modified to produce synthesised data. Real-life data offered by the VoIP providers also needs to be transformed into a format that can be used by the algorithms
- Test the different algorithms separately
- Test the entire framework
- Integrate the SCAMSTOP framework with the VoIP providers' infrastructures, test it and provide feedback to the RTD performers
- Update on the risk assessment methodology for SCAMSTOP architecture

Project results:

This part briefly describes the main achievements of the SCAMSTOP project.

Problems addressed

The classification of fraud can be achieved in different ways according to the point of view from which the related activities are observed. However, the categorisation that is generally cited in the literature is the following:

- Subscription fraud: This consists of obtaining an account or service, often with false identity details, without the intention of paying. The account is usually used for call selling or intensive self-usage.
- Superimposed fraud: A fraud activity is said to be superimposed when a fraudster illegally gets resources from legitimate users by gaining access to their phone accounts. This kind of fraud can be detected by the appearance of unknown calls on the bill of the compromised account. Scenarios describing this kind of fraud include: mobile phone cloning, breaking into a PBX system, etc.

Although the SCAMSTOP project addresses both categories, we would also like to draw attention to a fraud scenario that can be considered part of subscription fraud but is rarely mentioned in the literature: the case where the service usage does not match the subscription type. For instance, some customers subscribe to a residential service, which is usually cheaper than a business one, and use it for business purposes. Another case is where the customer subscribes to the option that allows it to use its own PBX, and then uses this PBX as a dialer for call centre purposes. On the infrastructure side, this service abuse looks like a denial of service (DoS) attack: it affects the VoIP provider's network and reduces its capacity. The severity of this situation depends on the capacity of the SIP trunk and on how often the operation is repeated. Another scenario that in no case matches the subscription type is the use of the provider infrastructure to build some kind of subscriber database that can be sold to marketing companies. For instance, some customers look for operational mobile accounts by trying to connect to them without establishing the calls. Based on the provisioning response messages, they determine whether the targeted mobile accounts are operational or not.

Unfortunately, the VoIP provider cannot a priori be aware of the device that is installed in the customer's premises and for which purpose it is being used. In addition to that, the VoIP provider that has hundreds of thousands of customers cannot easily check the installations related to all these accounts.

Achievements

The SCAMSTOP project delivers a platform for VoIP fraud detection and management. Within this project, a number of components were developed and tested. These components belong to three main levels:

- Detection level: Here, several standalone modules were developed to detect fraud activities. This particularly includes a rule system and clustering algorithms.
- Management level: Here, a rich management interface allowing alarms and results visualisation, rules creation, and algorithms configuration was developed.
- Coordination level: Here, an event-based system for correlating the activities of the detection modules was developed.

In the following paragraphs, we explain each of these levels in more detail.

Detection level

1. The rule system

The rule-based approach defines fraud patterns as rules. A rule consists of one or more conditions. If all the conditions are met, an alarm is produced. The rules can simply be applied to the call detail records (CDRs). In this section, we briefly discuss the main features of the rule system. For more details, we refer to the deliverables D2.2, D3.1 and D4.1.

For creating composed rules, we undertook the following tasks:

- definition of the features needed to be used for creating the rules (IP address, daily quarter, call type, time window size, etc.),
- development of a tool for checking the syntax of the rules,
- identification and storage of the rules (validation time, criteria to fetch them),
- rules enforcement.

For the alarms generation part, we undertook the following tasks:

- definition of the alarms format and content,
- development of an interface for alarms visualisation.

Implementation

The rule-based system is intended to be used for detecting users' abnormal behaviour. For this purpose, thresholds are used to verify whether a certain feature (e.g. the call duration) exceeds a given value. In our implementation, it is possible to apply the rules to a single CDR or to a set of CDRs. To apply a rule to a collection of CDRs, a mechanism similar to 'counters' was implemented. The counter functionality was designed to provide as much flexibility as possible, supporting on-the-fly creation as well as sliding windows (here the availability period of the counter acts as the size of the window). Through a sliding window, it is possible to define rules such as:

'If a user A makes more calls in a given time slot than a given threshold, raise an alarm'.

In the beginning, existing rule-based systems such as CLIPS and JESS were studied in order to use them as a basis for the rule engine. However, we found that these systems are very complex and cannot easily be enhanced with mechanisms such as sliding windows. Therefore, we decided to develop our own rule-based system. For simplicity and for easier integration with the event system module, Python was used as the programming language.

Rule language

In the SCAMSTOP project, a simple language to define the rules was introduced. A readable rule is typically created as a sequence of sub-rules. These sub-rules will be checked sequentially. Each time a new CDR comes, the rule is applied. A sub-rule consists of two parts: the condition and the action. The action is only performed if the condition is evaluated as true. The action contains information about either an alarm that should be generated or a counter that needs to be created or a counter value that needs to be increased. A condition can also contain binary and unary operators and sub-conditions. After the parsing, the conditions are stored in a tree data structure. The action is stored in a list data structure sequentially built on the input order. To handle the monitored CDRs, an identifier is used for each feature of the CDR data that can be built into a condition.

The counter mechanism is very important for the rule-based system. It is used for measuring how often an event occurred in the CDRs within a given time window. In our context, each counter has an identifier. Counters can be created and deleted. A counter is usually created with a 'threshold' value and, if needed, the length of the counter's availability (used for the sliding window). If no availability time is set, the counter remains available for as long as the rule-based system runs. Functions to retrieve the current counter value and to increase the counter have been implemented. If a counter reaches a value above the initially set threshold, an alarm is generated via the event system. The need for these functions becomes apparent in the example given later in this section.

One important issue with the sliding window in counters refers to the time stamp. The rule-based system does not generate its own time stamp. All time information will be extracted from the input data of the CDRs. This behaviour enables executing the rules over the CDRs in a near real-time manner.
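The counter mechanism described above can be illustrated with a minimal Python sketch. It is not the SCAMSTOP implementation: the class name `SlidingCounter`, the helper `check_call_rate` and the dictionary layout of a CDR are assumptions made for illustration. The sketch only shows the two properties stated in the text, namely that a counter has a threshold and an optional availability period, and that all time information comes from the CDR time stamps rather than from a system clock.

```python
from collections import deque


class SlidingCounter:
    """Counts events inside a time window, using CDR time stamps only."""

    def __init__(self, threshold, window_seconds=None):
        self.threshold = threshold
        self.window_seconds = window_seconds  # None: the counter never expires
        self._events = deque()  # time stamps of counted events, oldest first

    def _expire(self, now):
        """Drop events that have left the sliding window."""
        if self.window_seconds is None:
            return
        while self._events and now - self._events[0] >= self.window_seconds:
            self._events.popleft()

    def count(self, timestamp):
        """Record one event; return True if the threshold is now exceeded."""
        self._events.append(timestamp)
        self._expire(timestamp)
        return len(self._events) > self.threshold

    def value(self, now):
        """Return the number of events still inside the window at time `now`."""
        self._expire(now)
        return len(self._events)


def check_call_rate(cdrs, threshold, window_seconds):
    """Return (user, time stamp) pairs for every CDR that exceeds the threshold.

    Assumption: CDRs are dicts with 'src_id' and 'timestamp' keys and arrive
    ordered by time stamp.
    """
    counters = {}
    alarms = []
    for cdr in cdrs:
        user = cdr["src_id"]
        if user not in counters:
            counters[user] = SlidingCounter(threshold, window_seconds)
        if counters[user].count(cdr["timestamp"]):
            alarms.append((user, cdr["timestamp"]))
    return alarms
```

Because the window is evaluated against the time stamp of the incoming CDR, the same code can replay historical CDRs or process them as they arrive.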

From the web interface to the rule system

The user defines a new rule set through the web interface. This can be done in a user-friendly way, since the user does not need to specify each sub-rule and can use predefined functions. The web interface generates the necessary sub-rules from this more readable input. The web interface also saves the new rules in a database and sends the 'IDs' of the rules to the event system with a special event type. The event server forwards this event to the rule system. The rule system reads the new rules from the database and applies them.

Example

The following rule set implements the example 'If a user generates more than 10 premium calls within 1 hour, twice a week, raise an alarm.'

- 'is_premium & !have_counter('$src_id premium')', 'init_counter('$src_id premium' 3600)'
- 'is_premium == 1', 'count('$src_id premium' 1 timestamp)'
- '((get_counter('$src_id premium') > 10) & !have_counter('$src_id premium 2 times')) & is_premium', 'init_counter('$src_id premium 2 times' 604800)'
- 'get_counter('$src_id premium') > 10' , 'count('$src_id premium 2 times' 1 timestamp)'
- '(get_counter('$src_id premium') > 10) & (is_premium == 1)', 'del_counter('$src_id premium')'
- '(get_counter('$src_id premium 2 times') >= 2) & (is_premium == 1)', 'alarm('User $src_id make 2 or more times 10 premium calls in a week') del_counter('$src_id premium 2 times')'
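To make the intent of the rule set above easier to follow, here is a rough Python equivalent of its logic. It is only a reading of the rules, not the rule engine itself: the function name `premium_call_alarms` and the CDR dictionary layout are hypothetical, and the exact expiry semantics of `init_counter` are assumed to be a sliding window of the given number of seconds.

```python
HOUR = 3600      # window of the first counter, as in init_counter(... 3600)
WEEK = 604800    # window of the second counter, as in init_counter(... 604800)


def premium_call_alarms(cdrs, calls_per_hour=10, bursts_per_week=2):
    """Return (src_id, time stamp) pairs for which the example rule fires.

    Assumption: CDRs are dicts with 'src_id', 'timestamp' and 'is_premium'
    keys, ordered by time stamp.
    """
    hourly = {}  # src_id -> time stamps of premium calls in the last hour
    weekly = {}  # src_id -> time stamps of hourly bursts in the last week
    alarms = []
    for cdr in cdrs:
        if not cdr["is_premium"]:
            continue
        user, now = cdr["src_id"], cdr["timestamp"]
        calls = [t for t in hourly.get(user, []) if now - t < HOUR] + [now]
        hourly[user] = calls
        if len(calls) > calls_per_hour:
            hourly[user] = []  # like del_counter: start a fresh hourly count
            bursts = [t for t in weekly.get(user, []) if now - t < WEEK] + [now]
            weekly[user] = bursts
            if len(bursts) >= bursts_per_week:
                alarms.append((user, now))
                weekly[user] = []  # the weekly counter is deleted after the alarm
    return alarms
```

For example, a user who places 11 premium calls within a few seconds on two different days of the same week triggers exactly one alarm, at the time stamp of the call that completes the second burst.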

Additional information can be found in:

- SCAMSTOP deliverable D2.2
- SCAMSTOP deliverable D3.1.

2. The signature based technique

Unsupervised techniques can be used when we are not certain which transactions in the database are fraudulent and which are legitimate. These techniques are based in particular on what is called a 'profile / signature' or 'normal behaviour'. Here, the past behaviour of the user is accumulated in order to build a profile that is then used to predict the user's future behaviour. As this profile describes the habitual service usage pattern of the user (called 'normal behaviour'), any significant deviation from this profile has to be reported because it might hide some fraudulent activities. A signature can be seen as a statistical description or a set of features that captures the typical behaviour of the user, namely the total number of calls, the number of calls to international / premium / mobile destinations, and the duration of the calls. Unfortunately, the current use of this method in telecom fraud detection does not take into account several aspects, including the business plans of the VoIP providers. Moreover, such approaches are not easy to evaluate because too few details are published. We also believe that such an approach leads to performance problems, especially as the signature gets bigger, and that using global metrics for comparison (such as the Hellinger distance) leads to the loss of information about individual features. This means it will be difficult to know which feature led to the observed misbehaviour.

When computing the signature, we also need to deal with fluctuations in the service usage, which varies from one day to another, as well as with periods of inactivity in which the subscriber did not use the service. These issues are not discussed in the literature; however, they are addressed in the context of our work. It is also worth mentioning that our solution investigates how the initialisation and the update of the signatures can be achieved based on the related specification.

Our approach is as follows:

- For each user, we build a short-term (daily basis) signature based on features such as: number of calls to premium / international / mobile destinations as well as their durations.
- Reduce data fluctuation by dividing the day into four time periods: morning, afternoon, evening, and night.
- Remove the inactivity-related information. This information is reflected, for instance, by a 'null' value for one of the signature features. Imagine the case where a user did not make any premium call in the afternoon. Keeping the 'null' values would affect the calculation of the mean and does not bring any valuable information regarding fraud detection.
- Integrate the short term signatures into a long term signature (on a monthly basis) using the trimmed mean and the related standard deviation.
- Before we compute the trimmed mean, we first transform our data towards a normal distribution using the logarithm or the square root function.
- To check for misbehaviour, we compare the long-term and short-term signatures on a per-feature basis using the z-score technique.
- The update of the long term signature is straightforward due to the way the signature is defined.
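The steps above can be sketched in Python for a single signature feature (for example, the number of premium calls in the afternoon). This is an illustrative sketch under stated assumptions, not the project's code: the function names are hypothetical, `log1p` is used instead of the plain logarithm so that a value of zero is defined, the trimming proportion of 10 % is arbitrary, and the standard deviation is assumed to be computed over the same trimmed sample as the mean.

```python
import math
import statistics


def trimmed(values, proportion=0.1):
    """Drop the given proportion of values from each end of the sorted list."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)
    return ordered[k:len(ordered) - k] if k else ordered


def long_term_signature(daily_values, proportion=0.1):
    """Return (mean, std) of the transformed, trimmed, non-null daily values.

    Inactivity is represented by None and removed before any statistics are
    computed. At least two active values must remain after trimming.
    """
    active = [math.log1p(v) for v in daily_values if v is not None]
    kept = trimmed(active, proportion)
    return statistics.mean(kept), statistics.stdev(kept)


def z_score(today, signature):
    """Compare one short-term value against the long-term (mean, std)."""
    mean, std = signature
    value = math.log1p(today)
    if std == 0:
        return 0.0 if value == mean else math.inf
    return (value - mean) / std
```

A feature whose z-score exceeds a chosen limit (3 is a common choice, assumed here rather than taken from the project) can be reported individually, which is the per-feature visibility that a single global distance would hide.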

Using the z-score for each feature of the signature directly gives the impact of this feature on the entire signature. We have also shown that the z-score and the Hellinger distance are linked to each other, which favours the z-score as it is easier to manipulate. In addition, and contrary to the solutions already proposed in the literature, we use several appropriate profiles instead of a single complex one, as it is easier to manipulate them separately.

Initialisation

New users might also be fraudsters. This means that such users have to be monitored right after they start using the service. To build a reliable profile for a user, we need to observe the user for some time (a week, a month, etc.). However, this cannot be applied to users that have just signed up for the service. A typical behaviour in subscription fraud is the excessive usage of an account or a subscription in a very short time interval, which enables the fraudster to escape detection. Signature initialisation is therefore an important step in detecting and preventing subscription fraud. It is also a challenging task due to the very limited data about new subscribers.

Dealing with a new subscriber in our context relies on the signature technique that we have already discussed. In the signatures, the user's service usage is mainly described by the mean and the related standard deviation. In addition, the signature is updated on a per-day basis. As a consequence, a new subscriber can be observed for two days. If no fraudulent activity requiring an interruption of the service is detected during this time, the new user is assigned a signature in the way discussed earlier. This is possible because the mean and the standard deviation of each feature can be computed over two days but not over one day.

Observing the new subscriber during the first two days can be achieved by checking in particular whether this subscriber:

- has made at least one international call (or a premium call) with a duration greater than a predefined threshold,
- or made a call to a destination from the black list.

We assume here that the VoIP provider maintains a list of phone numbers or account IDs of subscribers that have committed fraud. If one of the above conditions is met, the new subscriber is labelled as suspicious and monitored closely during the mentioned two days.
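The two checks applied to a new subscriber can be expressed as a short Python sketch. The function name and the CDR fields `destination`, `call_type` and `duration` are hypothetical, and the duration threshold is left as a parameter because the text does not give a value.

```python
def is_suspicious_new_subscriber(cdrs, blacklist, max_duration_seconds):
    """Return True if any CDR of a new subscriber matches one of the checks.

    Assumption: each CDR is a dict with 'destination', 'call_type' and
    'duration' (in seconds); `blacklist` is a set of destinations.
    """
    for cdr in cdrs:
        # Check 1: a call to a destination from the black list.
        if cdr["destination"] in blacklist:
            return True
        # Check 2: an international or premium call longer than the threshold.
        long_call = cdr["duration"] > max_duration_seconds
        if cdr["call_type"] in ("international", "premium") and long_call:
            return True
    return False
```

The function would be run on the CDRs collected during the first two days, after which the regular signature comparison takes over.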

Our approach was tested on known and unknown fraud cases, as will be discussed in the testing part. A paper related to the obtained results was submitted to the SIGCOMM 2012 conference.

Additional information can be found in:

- SCAMSTOP deliverable D2.2
- SCAMSTOP deliverable D3.2
- SCAMSTOP deliverable D4.1.

3. The balanced iterative reducing and clustering using hierarchies (BIRCH) algorithm

Clustering is used to arrange a set of n users into groups (also referred to as profiles or signatures, see the previous section), such that each group consists of users whose call patterns have similar characteristics. This essentially requires clustering the m feature vectors of the n users that represent their long-term signatures.

Before looking in more detail at how clustering was used in the SCAMSTOP project, two important points are worth mentioning. The first concerns the fundamental purpose behind clustering the feature vectors and the second the expected result of the clustering process. Specifically, in this work, clustering is used to automatically infer knowledge about the existence, number and nature of intrinsic groups in the analysed feature vectors. In more detail, it is expected that the feature vectors fall into a few compact and well separated clusters and that these clusters are pure, in the sense that some comprise only feature vectors of regular subscribers and the others only feature vectors of fraudulent users. The implication of this result is that it is feasible to reliably distinguish between regular and fraudulent users by analysing their call detail records. This makes it possible to automatically detect an end user that exhibits fraudulent behaviour.

BIRCH is a hierarchical clustering algorithm designed for particularly large data sets. An advantage of BIRCH is its ability to incrementally and dynamically cluster incoming, multi-dimensional metric data points in an attempt to produce the best quality clustering for a given set of resources (memory and time constraints). In most cases, BIRCH only requires a single scan of the feature vectors. The principle behind this is that in BIRCH each clustering decision is made without scanning all data points and currently existing clusters. In more detail, BIRCH exploits the observation that the data space is usually not uniformly occupied and that not every data point is equally important. It thereby makes full use of the available memory to derive the finest possible partition while minimising I/O costs. Furthermore, BIRCH is an incremental method that does not require the whole data set in advance. In addition, BIRCH is recognised as the first clustering algorithm proposed in the database literature to handle noise (i.e. data points that are not part of the underlying pattern) in an effective way.

Since clustering finds groups of feature vectors that are not known a priori, the clustering results need some kind of validation, regardless of the clustering algorithm used. In particular, hierarchical clustering algorithms require an a posteriori decision with respect to the partition that best satisfies or reproduces the underlying structure of the feature vectors. Put differently, inferring valuable knowledge from the hierarchy of partitions that these algorithms produce involves using a criterion that can determine the optimal number of clusters. Over the past years, a large number of criteria have been proposed for specifying the hierarchical level on which to base inferences concerning the true differences between feature vectors. These criteria, known as stopping rules, evaluate, for a given hierarchical clustering algorithm and set of feature vectors, the partition of each hierarchical level by comparing it to the partitions of every other level. Although significant effort has been devoted to stopping rules, there is no general gold standard capable of revealing the optimal number of clusters over sets of data objects from diverse application fields. Hence, in the context of SCAMSTOP, three well-known and widely used stopping rules are employed in a complementary way: the Dunn, the silhouette width and the Davies-Bouldin indices.

Concretely, the BIRCH algorithm was implemented and used in the context of SCAMSTOP in the following way:
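As an illustration of how a stopping rule scores one candidate partition, the following Python sketch computes the Dunn index in one of its common forms: the smallest distance between points of different clusters divided by the largest cluster diameter. The function name and the input layout are assumptions, and other variants of the index define the two quantities differently.

```python
import math
from itertools import combinations


def dunn_index(clusters):
    """Dunn index of a partition; higher means more compact and better separated.

    `clusters` is a list of at least two clusters, each a list of equal-length
    numeric tuples (Euclidean distance is assumed).
    """
    # Smallest distance between two points that belong to different clusters.
    separation = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    # Largest distance between two points of the same cluster (its diameter).
    diameter = max(
        (math.dist(p, q) for c in clusters for p, q in combinations(c, 2)),
        default=0.0,
    )
    return separation / diameter if diameter else math.inf
```

Evaluating this value for the partition at each level of the hierarchy and keeping the level with the highest score is one way to select the number of clusters; the silhouette width and the Davies-Bouldin index can be applied to the same partitions as a cross-check.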

- read CDRs from the CDRs database,
- separate the read CDRs on a per user basis,
- create a signature for every user following the method discussed previously,
- find suitable features to distinguish between users. This is also based on the signatures' features discussed earlier,
- apply clustering to the signatures,
- find the best clustering structure. The best number of clusters is not known a priori. The BIRCH algorithm provides such information,
- deal with the time-consuming and error-prone nature of this process. The stopping rules can be used here,
- use heuristics to label the clusters,
- find which clusters reflect the fraudsters.

The BIRCH clustering algorithm was also tested. The algorithm arranges data points (vectors in a multidimensional space) into a tree-like hierarchy of clusters. The vectors are grouped according to a particular distance metric that defines the radius of a cluster and the distance between two clusters. Then, we select one appropriate level of clusters, which we mark as ordinary (big clusters) or suspicious (small clusters). The small clusters contain the points that represent unusual user activity, and many of these users may be fraudsters or call centres. The big clusters represent more common patterns of user behaviour.

The data used for testing was provided by VozTelecom. It consists of the activities of 4825 accounts over a 3-month period, with almost 31 million CDRs. We assumed in our experiments that the users' accounts existed throughout the whole testing period, so for each of the 90 days 4825 short signatures may be produced. Every short signature contains 20 numerical characteristics and is a point in a 20-dimensional vector space. The signatures for a given day are considered as data points and clustered through the following steps:

1. The data points are sorted by 'total number of calls' in ascending order to avoid skewness, because BIRCH may be sensitive to the order of the input.
2. The coordinates of the points are scaled by their standard deviations, because the spread differs from one characteristic to another. The scaled coordinates have the same order of magnitude.
3. The scaled points are clustered with BIRCH, using the Euclidean distance metric, cluster radius metric, inter-cluster distance metric, the branching factors B and L, and the cluster threshold T (we refer to the deliverable D3.1 for more details).
4. Small clusters are marked as suspicious, and the list of users whose activity data belongs to these clusters is compared against the test accounts provided by the VoIP provider.
5. True / false positive / negative rates are computed.
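The processing around the clustering step (steps 2, 4 and 5 above) can be sketched in plain Python. The sketch deliberately does not implement BIRCH itself: it assumes that some clustering routine has already produced one cluster label per user. All function names, the size limit that defines a 'small' cluster and the choice of the population standard deviation are assumptions for illustration.

```python
import statistics
from collections import Counter


def scale_by_std(points):
    """Step 2: divide every coordinate by the standard deviation of its column."""
    # A constant column has a standard deviation of 0; it is left unscaled.
    stds = [statistics.pstdev(column) or 1.0 for column in zip(*points)]
    return [tuple(x / s for x, s in zip(point, stds)) for point in points]


def suspicious_users(users, labels, max_cluster_size):
    """Step 4: return the users whose cluster has at most max_cluster_size members."""
    sizes = Counter(labels)
    return {u for u, label in zip(users, labels) if sizes[label] <= max_cluster_size}


def detection_rates(flagged, known_fraud, all_users):
    """Step 5: true and false positive rates against the known test accounts."""
    tp = len(flagged & known_fraud)
    fp = len(flagged - known_fraud)
    fn = len(known_fraud - flagged)
    tn = len(all_users) - tp - fp - fn
    return {
        "true_positive_rate": tp / (tp + fn) if tp + fn else 0.0,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }
```

In this layout the clustering algorithm can be exchanged without touching the scaling or the evaluation, which makes it straightforward to compare BIRCH against the other detection modules on the same signatures.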

Additional information can be found in:

- SCAMSTOP deliverable D3.2
- SCAMSTOP deliverable D4.1.

4. The nearest neighbour algorithm

Nearest neighbour (also known as collaborative filtering or instance-based learning) is a useful data mining technique that allows past data instances with known output values to be used to predict the unknown output value of a new data instance. Nearest neighbour is very successful in situations where neither regression nor classification can be applied. First, regression can only be used for numerical outputs. Second, classification is difficult for data where the number of classes is large and not known in advance (e.g. the hundred thousand products of a company). Since the behavioural groups of users can be numerous and are not determined a priori, we propose a solution based on nearest neighbours.

The K-nearest neighbor (KNN) algorithm is one of the oldest algorithms in machine learning. Given a set of training data and a tuple X, the KNN algorithm searches the k nearest neighbours to X based on a distance measure. Our idea was to use the KNN as a clustering module in SCAMSTOP. This kind of clustering can be considered as lazy since no general model is built until a given sample needs to be analysed. In this sense, every subscriber has its own cluster. This solution gets rid of the difficulties of dynamic clustering such as the operations of cluster maintenance and cluster identification in case of splits and merges. This choice reveals also to be efficient if only one or a subset of subscribers need to be analysed.

The idea is the following: for each subscriber, we monitor the K users having the closest behaviour during a given time window in terms of a number of defined features. These users are called the neighbours of the subscriber in question. We update the list of neighbours after each analysis window using a bumping algorithm, since new neighbours appear and old neighbours may become irrelevant. For each neighbour we track two variables: the frequency (how many times it has been chosen among the top-K neighbours) and the recency (when it was last seen). The update is done periodically throughout the training period. The bumping algorithm works as follows:

Input:

- L1: list of old neighbours with their frequencies and recencies
- L2: list of new neighbours

Body:

For each neighbour in L2:
    If neighbour exists in L1:
        Update the frequency and the recency of the neighbour

For each neighbour in L2:
    If neighbour doesn't exist in L1:
        Draw a random number in [0,1]
        If random number > threshold (we choose 0.7 in our case):
            Choose a candidate from L1
            (the candidate should have a low frequency
            and should not have been seen recently)
            Replace the candidate by the new neighbour

Output:

- L3: List of updated neighbours with their frequencies and recencies
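
The bumping step can be sketched in Python as follows. This is an illustration under stated assumptions, not the SCAMSTOP implementation: the data layout (a dictionary per neighbour), the choice of the replacement candidate as the neighbour with the lowest frequency and, on ties, the oldest sighting, and the initial frequency of 1 for a newcomer are all assumptions made for the example.

```python
import random

def bump(old, new_ids, now, threshold=0.7, rng=random):
    """Return the updated neighbour list L3.

    old: dict mapping neighbour id -> {"freq": int, "last_seen": int} (L1)
    new_ids: neighbour ids found in the current window (L2)
    now: index of the current analysis window
    """
    updated = {nid: dict(stats) for nid, stats in old.items()}

    # First pass: refresh neighbours that are already known.
    for nid in new_ids:
        if nid in updated:
            updated[nid]["freq"] += 1
            updated[nid]["last_seen"] = now

    # Second pass: a newcomer may replace a weak old neighbour.
    for nid in new_ids:
        if nid in updated:
            continue
        if rng.random() > threshold:
            # Assumption: neighbours seen in this window are never evicted.
            candidates = [c for c in updated if c not in new_ids]
            if not candidates:
                continue
            victim = min(
                candidates,
                key=lambda c: (updated[c]["freq"], updated[c]["last_seen"]),
            )
            del updated[victim]
            updated[nid] = {"freq": 1, "last_seen": now}
    return updated
```

Passing the random source as a parameter keeps the sketch testable; in normal use the default `random` module is sufficient.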

For the time periods in which some subscribers are absent and do not issue any calls, the lists of neighbours for these subscribers are not updated. In this way, we compensate for situations in which many subscribers go idle and it is difficult to know the real neighbours (for example, weekends or holidays for professional accounts).

In the testing period, we ask the question 'is there a sudden change in the neighbours of a given subscriber?'. To answer this question, we compare the neighbours discovered during the testing period with the updated history of the subscriber. This comparison is quantified using a score (s). The score is increased for each training neighbour that is found among the testing neighbours and decreased for each training neighbour that is not. The amount of increase or decrease depends on the frequency and the recency of the neighbour in question. As a result, subscribers with large negative scores are reported by our algorithm as suspicious: some neighbours that were seen frequently and recently in the past are no longer seen in the testing period.
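
The scoring idea can be sketched as follows. The text only states that the increment depends on frequency and recency; the concrete weight used here (frequency damped by an exponential decay with a half-life of 7 windows) is a hypothetical choice for illustration, as are the function names.

```python
def neighbour_weight(freq, last_seen, now, half_life=7.0):
    """Hypothetical weight: frequency, halved for every half_life windows of age."""
    age = now - last_seen
    return freq * 0.5 ** (age / half_life)

def change_score(history, test_neighbours, now):
    """Score s: positive when old neighbours persist, negative when they vanish.

    history: dict neighbour id -> {"freq": int, "last_seen": int}
    test_neighbours: set of neighbour ids found in the testing period
    """
    score = 0.0
    for nid, stats in history.items():
        weight = neighbour_weight(stats["freq"], stats["last_seen"], now)
        score += weight if nid in test_neighbours else -weight
    return score
```

With such a weight, losing a neighbour that was seen often and recently lowers the score much more than losing one that was seen once, long ago.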

The features used for the computation of the KNN include: the number of calls issued during the analysed window, the average call duration for successful calls, the ratio of successful calls, the ratio of calls towards international destinations and the information entropy of the different source IPs of the caller.
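
As an illustration of the last feature, the entropy of the caller's source IPs can be computed as sketched below. The sketch assumes Shannon entropy in bits over the observed frequency of each address; the function name is illustrative and the addresses are documentation examples.

```python
import math
from collections import Counter

def source_ip_entropy(ips):
    """Shannon entropy (in bits) of the source IPs seen in one window."""
    counts = Counter(ips)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# A caller always using one address has entropy 0; a caller spread evenly
# over many addresses has a high entropy.
single = source_ip_entropy(["192.0.2.1"] * 4)
spread = source_ip_entropy(["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"])
```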

For the implementation, we used the instance-based learner IBk from the WEKA library. IBk is a classifier based on the KNN algorithm. In our case, we use it only to compute the nearest neighbours, since our problem is not one of classification. The search algorithm is linear (brute-force search) and the distance is Euclidean.
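
The linear search with Euclidean distance can be sketched in a few lines of Python. This does not use or reproduce the WEKA API; it only illustrates the brute-force search, and it assumes that the feature vectors have already been scaled to comparable ranges. The subscriber identifiers and vectors are made up.

```python
import math

def k_nearest(target, profiles, k):
    """Return the ids of the k subscribers closest to `target`.

    profiles: dict subscriber id -> feature vector (tuple of floats)
    """
    distances = [
        (math.dist(profiles[target], vector), uid)
        for uid, vector in profiles.items()
        if uid != target
    ]
    distances.sort()  # linear scan followed by a sort: brute-force search
    return [uid for _, uid in distances[:k]]

profiles = {"u1": (0.0, 0.0), "u2": (1.0, 0.0), "u3": (5.0, 5.0), "u4": (0.0, 2.0)}
neighbours = k_nearest("u1", profiles, k=2)
```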

Additional information can be found in:

- SCAMSTOP deliverable D3.2
- SCAMSTOP deliverable D4.1.

5. The neural network self-organising map (NN-SOM)

The idea behind using NN-SOM as a fraud detection module in SCAMSTOP was based on the assumption that the behaviour of an individual subscriber does not change suddenly. Thus, if historical behaviour patterns are used to train the neural network and the current behaviour pattern is given as input, the current pattern should be classified as a known pattern. If this does not happen, we should consider the subscriber's behaviour as possible fraud. In this sense, every subscriber has to have its own neural network.

On the other hand, special temporary situations may occur, such as an emergency (e.g. an earthquake) or a national holiday, during which it is usual for subscribers to change their calling patterns. In such situations a change in 'normal' behaviour should not be considered fraud. To overcome this problem, we partition the subscribers into groups, and for each group a further NN-SOM network is created using the aggregated records of all users in the group as training data. As with the KNN algorithm, we do not have to create perfect partitions of the subscribers: a subscriber can participate in one or more groups whose members had the closest behaviour during a given time window. The groups can be formed by comparing the output of the subscriber's individual NN-SOM with input from the same period of the other group participants, or by reusing the clustering of subscribers produced by the KNN algorithm. When a possible fraud is detected for an individual subscriber, the subscriber's input record is also fed to the NN-SOM of the group, and the individual NN-SOMs of the other subscribers belonging to the group are activated and fed with the corresponding users' data. If the group behaviour has also changed, this is an indication that we may not have a fraud situation. If the other members of the group have not changed their behaviour, the fraud indication is stronger.

The implementation of the algorithm is based on the Encog neural network framework (see http://www.heatonresearch.com/encog for details). For each subscriber, a lightweight NN-SOM is created with 9 input neurons and 50 output neurons. The input record represents aggregated values of call data for each subscriber over a predefined period of time (e.g. 60 min). In particular, the input record comprises the following fields: quarter of the day, working day or not, number of calls, average duration of calls, percentage of international calls and percentage of premium calls.
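
The aggregation of CDRs into an input record can be sketched as follows. The sketch covers only the six fields named above (the remaining inputs of the 9-neuron network are not specified here). The CDR field names, the numbering of the quarters of the day from 0 to 3 and the treatment of Saturday and Sunday as the only non-working days are assumptions made for the example.

```python
from datetime import datetime

def input_record(cdrs, window_start):
    """Aggregate the CDRs of one subscriber and one time window.

    cdrs: list of dicts with hypothetical keys "duration" (seconds),
          "international" (bool) and "premium" (bool)
    window_start: datetime marking the start of the window
    """
    quarter = window_start.hour // 6            # 0..3, six hours each
    working_day = 1 if window_start.weekday() < 5 else 0
    n = len(cdrs)
    if n == 0:
        return [quarter, working_day, 0, 0.0, 0.0, 0.0]
    avg_duration = sum(c["duration"] for c in cdrs) / n
    pct_international = 100.0 * sum(1 for c in cdrs if c["international"]) / n
    pct_premium = 100.0 * sum(1 for c in cdrs if c["premium"]) / n
    return [quarter, working_day, n, avg_duration, pct_international, pct_premium]

record = input_record(
    [
        {"duration": 60, "international": True, "premium": False},
        {"duration": 120, "international": False, "premium": False},
    ],
    datetime(2011, 1, 3, 14, 0),  # a Monday afternoon
)
```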

A high level pseudo code for the training phase is as follows:

Input:

L1: List of CDR
S: List of subscribers
T: Training period
Tp: Time partition step

Body:

For each s in S
    For each Tp in T
        Extract input record(s, Tp) from L1
        Update NN-SOM(s)
    Persist NN-SOM(s) in database

Output:

For all s in S, NN-SOM(s) exists in database

A high level pseudo code for the fraud detection phase is presented below:

Input:

L1: List of CDR of subscriber s
Tp: last Time partition step

Body:

Extract input record(s, Tp) from L1
Fetch NN-SOM(s) from database
Feed input record(s, Tp) in NN-SOM(s)
If (Fraud_detected) by NN-SOM(s)
    Fetch NN-SOM(group(s)) from database
    Feed input record(s, Tp) in NN-SOM(group(s))
    Foreach s' in group(s)
        Fetch NN-SOM(s') from database
        Fetch L2: List of CDR of subscriber s' for Tp
        Extract input record(s', Tp) from L2
        Feed input record(s', Tp) in NN-SOM(s')
        Get possible fraud percentage from NN-SOM(s')
    Generate ballot for group special situation
    Decide if special situation exist
    if (no special situation)
        Generate fraud alarm
    else // We have a special situation in all group members
        Update NN-SOM(s)
        Persist NN-SOM(s) in database
else
    Update NN-SOM(s)
    Persist NN-SOM(s) in database

Output:

1. an updated NN-SOM(s)
2. an optional fraud alarm.
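
The ballot step of the detection phase can be sketched as follows. The pseudo code does not state how the ballot is evaluated; the rule used here (a group member votes for a special situation when its fraud percentage reaches 50, and a special situation is declared when more than half of the members vote for it) is a hypothetical choice for illustration only.

```python
def special_situation(member_fraud_pct, member_threshold=50.0, quorum=0.5):
    """Ballot over the fraud percentages reported by the group members."""
    if not member_fraud_pct:
        return False
    votes = sum(1 for pct in member_fraud_pct if pct >= member_threshold)
    return votes / len(member_fraud_pct) > quorum

def decide(subscriber_flagged, member_fraud_pct):
    """Return "alarm" or "update", following the branches of the pseudo code."""
    if not subscriber_flagged:
        return "update"        # normal behaviour: keep training the NN-SOM
    if special_situation(member_fraud_pct):
        return "update"        # the whole group changed: no alarm
    return "alarm"             # only this subscriber changed
```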

The NN-SOM was tested using data provided by VozTelecom. On a modern server with sufficient memory, the initial training phase of the neural network for each subscriber required 0.44 - 1.04 sec, with an average training time of 0.6 sec. The initial training phase is required only once, although incremental training of the neural networks is constantly required in order to keep the neural networks

Additional information can be found in:

- SCAMSTOP deliverable D3.2
- SCAMSTOP deliverable D4.1.

6. The probabilistic approach

The probabilistic approach generates the user profile by processing the CDRs. For every type of user, this profile includes the origin, the destination and the average call duration. After the training period, the system can flag the user with his profile. All user profiles are stored in a user profile database, which holds the profiles of both legitimate and non-legitimate users. Non-legitimate users have specific characteristics that deviate from those of legitimate users (e.g. called numbers towards unusual destinations, or call durations that deviate from the average call duration).

Both user types are stored in the user profile database. The user profile database communicates with the policy controller, which monitors the database and makes decisions based on the user profiles. If a user belongs to a certain profile and a number of his calls deviate from it, the policy controller decides to communicate with the active or passive components of the system. The policy controller may include both passive and active components.

In this approach, we calculate all the conditional probabilities between our different variables and store them in a table. Such conditional probabilities include the probability of an odd destination country (e.g. a country in Africa) given that the time is morning, the calling number is fixed and the call duration exceeds a threshold. The aim of this probabilistic approach is to generate alarms for the following scenarios:

- Scenario 1: Increased number of calls to a certain African country that have short duration and spread throughout the day.
- Scenario 2: Increased number of calls to a non-European destination that have long duration and are made during the morning.

Input: Read from the CDR the following information

- source specific number (SN)
- source area code (or mobile company code) (SA)
- destination specific number (DN)
- destination country code (DC)
- destination area code (or mobile company code) (DA)
- time of day classification (CT):

i. night (00:01-08:00)
ii. morning (08:01-16:00)
iii. evening (16:01-24:00)

- weekday / weekend classification (CD)
- duration classification (DU)

i. short (0-90 sec)
ii. medium (90-240 sec)
iii. long (240 or more sec)

- Call type (TY):

i. local, long-distance, mobile or other

Confidence interval:

Generate the following conditional probabilities:
P(DC, CD, CT, DU, TY) = P(DC) P(DU | DC) P(CD | DU) P(TY | DU) P(CT | TY)
where DC: destination country, CD: call day, CT: call time, DU: duration, TY: call type.

Input CDR under tests

Output
Number of false alarms, number of true alarms
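
The factorisation of the joint probability can be illustrated with the sketch below. It estimates each factor by simple counting over a set of training CDRs and multiplies the factors for a CDR under test. The dictionary keys reuse the variable codes listed above; the function names, the counting estimator without smoothing and the handling of the class boundaries for the duration are assumptions made for the example.

```python
from collections import Counter

# (child, parent) pairs of the conditional factors P(child | parent).
FACTORS = [("DU", "DC"), ("CD", "DU"), ("TY", "DU"), ("CT", "TY")]
VARIABLES = ("DC", "DU", "CD", "TY", "CT")

def classify_duration(seconds):
    """Map a duration in seconds to the DU classes (boundaries assumed half-open)."""
    if seconds < 90:
        return "short"
    if seconds < 240:
        return "medium"
    return "long"

def fit(cdrs):
    """Count single values and (child, parent) value pairs in the training CDRs."""
    single = Counter()
    pair = Counter()
    for cdr in cdrs:
        for var in VARIABLES:
            single[(var, cdr[var])] += 1
        for child, parent in FACTORS:
            pair[(child, cdr[child], parent, cdr[parent])] += 1
    return single, pair, len(cdrs)

def probability(cdr, model):
    """P(DC) P(DU|DC) P(CD|DU) P(TY|DU) P(CT|TY) for one CDR under test."""
    single, pair, total = model
    if total == 0:
        return 0.0
    p = single[("DC", cdr["DC"])] / total
    for child, parent in FACTORS:
        parent_count = single[(parent, cdr[parent])]
        if parent_count == 0:
            return 0.0
        p *= pair[(child, cdr[child], parent, cdr[parent])] / parent_count
    return p
```

A CDR whose probability under the trained model is very low could then be treated as a candidate for an alarm; the threshold for doing so is not specified here.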

The probabilistic algorithm was tested using one month of data provided by the Greek VoIP provider ViVA, where all the above data were provided without any anonymisation process. These data were used to generate synthetic fraudulent CDR data using the SIPp tool.

Additional information can be found in:

- SCAMSTOP deliverable D3.2
- SCAMSTOP deliverable D4.1.

Management level

The aim of the SCAMSTOP management framework is to help the VoIP provider to administrate the SCAMSTOP platform. Through a friendly user interface, the administrator can:

- visualise the results of the detection.
- visualise the alarms generated.
- search for a given user and visualise his activities.
- filter activities.
- create and apply rules.
- configure and schedule the detection.

In the context of SCAMSTOP, a web interface was developed based on Django (see http://www.djangoproject.com/ for details), a high-level Python web framework that encourages rapid development and clean, pragmatic design. In this section, we only list the functionalities that this interface supports.

The management interface makes it possible to list all users or to look up a particular user by identifier. The identifiers currently in use are the ones provided by the VoIP partners. Once a user is selected, different frames showing the user's activities are available. As we intended the fraud management interface to be user-friendly, we decided to incorporate charts into it.

This makes the results easier to interpret, especially since the signatures include different features whose values might be difficult to analyse and explain without the support of graphs. Pie charts and histograms are used to describe the evolution of the user's activities during the signature period. The charts are based on the Highcharts library (see http://www.highcharts.com/demo/ online).

IP address information can support fraud detection. For a user who has a subscription with a given Internet provider, the IP addresses assigned to him should show some kind of correlation. Consequently, examining the IP addresses (and their frequency) used during the signature period is important, as it can reveal potential fraud indicators.

Additional information can be found in:

- SCAMSTOP deliverable D3.2
- SCAMSTOP deliverable D4.1
- Guide for the management interface.

Coordination level

The SCAMSTOP framework has a modular architecture where each module represents a fraud detection algorithm. This framework is designed in a way permitting incorporation of additional detection, correlation, analysis, and notification tools.

A legitimate question is how these techniques should be integrated to provide a better detection rate. To answer this question, note that launching the detection algorithms in parallel or in sequence (one after the other) does not really make sense, as some algorithms need to be scheduled over sufficiently large time intervals to be able to operate. The signature-based technique is one such algorithm. In contrast, a rule-based technique can be launched on demand. Indeed, the rule engine can be configured to apply a given rule to any new call (or CDR) that arrives during the night towards certain suspected destinations. In addition, an alarm can be sent (in urgent cases) by email or by other means to the fraud management expert.

For these reasons, we decided to implement the SCAMSTOP framework in an event-based manner. This means the different components communicate by generating and receiving notifications. An event reflects the occurrence of an item of interest to some of the system components, for instance the arrival of a new CDR or the creation of a new rule. The event-based architecture is well suited for large scale distributed applications and provides easy integration of autonomous and heterogeneous components into a complex system.

The event system builds a simple star-topology with 'a server' in the middle surrounded by clients. Each client talks only to the server. Clients can talk to each other; however, this is possible only through the server.

Our event system is based on XMPP, the 'extensible messaging and presence protocol', an open technology for real-time communication that powers a range of applications including instant messaging, presence, and voice and video calls. XMPP is open, standard, secure, extensible and flexible. In addition, XMPP is XML-based and is used by various companies, in particular Jabber, Google, Apple, Facebook and Skype.

The extensible messaging and presence protocol provides the following features:

- XMPP message: Request / response (e.g. server push), point-to-point, broadcast (e.g. publish / subscribe)
- Many extensions: XMPP over HTTP (XEP-0124), service discovery (XEP-0030), user location (XEP-0080), jingle (XEP-0166)
- Many implementations:

i. server: Jabber, ejabberd, Apache Vysper, Openfire etc.
ii. client: Jitsi, Empathy, iChat, JWChat etc.
iii. libraries: Python (pyxmpp, xmpppy), C++ (txmpp), Java (Smack), PHP (Jaxl).

In our implementation of the event system, we used the following libraries:

Client library:

- Python: xmpppy (see http://xmpppy.sourceforge.net/ for details). This client is integrated into the web interface to inform the XMPP server that a new rule was created.
- C++: txmpp (see http://github.com/silas/txmpp for details). This client is integrated, for instance, into the CDR database to inform the event system that a new CDR came in. It is also integrated into the rule engine to subscribe to the event system for notifications about events such as 'a new rule was created'.

Server:

- jabberd2 (see http://codex.xiaoka.com/wiki/jabberd2:start for details).

To inform the rule system about a new CDR or a new rule, the usual XMPP chat frame was modified so that every XMPP frame sent contains an XMPP-type field containing one of the following:

- 'rule_id'
- 'cdr_id'

The body of the frame contains the individual id of either the new rule or the new CDR. The rule system (initiated by its listening client) executes the following steps:

1. evaluate the type field of the incoming XMPP message,
2. a) if the type is related to a new rule: apply the new rule to all CDR entries in the database, or b) if the type is related to a new CDR: apply all rules marked 'active' in the database to the CDRs.

In order to avoid communication overhead for every single new CDR, a single cdr_id is sent that identifies the last CDR of a collection of new CDRs. This means that if we inject a collection of new CDRs (imagine 38 million) into the database at some point in time, the XMPP client will not send 38 million individual XMPP messages. After injecting the CDRs, only one message is sent, identifying the last CDR. The rule system, which stores (and therefore knows) the last cdr_id received before the injection, applies the active rules only to the new CDRs, up to the newly received cdr_id, in increasing cdr_id order (no computation overhead). For this scheme, all cdr_ids in the database must be auto-incremented.
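
The dispatching on the type field and the batching by last cdr_id can be sketched as follows. The sketch deliberately leaves out the XMPP transport and shows only the logic on the receiving side. The class name, the callback signature, the assumption that cdr_ids start at 1 and the assumption that a new rule is active as soon as it is created are all illustrative.

```python
class RuleSystem:
    """Receives (type, body) pairs extracted from incoming XMPP frames."""

    def __init__(self, apply_rule):
        # apply_rule(rule_id, first_cdr_id, last_cdr_id) is a hypothetical
        # callback that runs one rule over an inclusive range of CDR ids.
        self.apply_rule = apply_rule
        self.active_rules = []
        self.last_cdr_id = 0  # id of the last CDR already processed

    def on_message(self, msg_type, body):
        ident = int(body)
        if msg_type == "rule_id":
            # New rule: apply it to every CDR already in the database.
            self.active_rules.append(ident)
            if self.last_cdr_id > 0:
                self.apply_rule(ident, 1, self.last_cdr_id)
        elif msg_type == "cdr_id":
            # New batch: apply all active rules to the new CDRs only.
            first = self.last_cdr_id + 1
            if ident >= first:
                for rule_id in self.active_rules:
                    self.apply_rule(rule_id, first, ident)
                self.last_cdr_id = ident
        else:
            raise ValueError("unknown message type: " + msg_type)
```

Because only the last cdr_id of a batch is announced, the range from the stored id plus one up to the announced id identifies exactly the new CDRs, which is why auto-incremented ids are required.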

Beyond this functionality, receiving XMPP messages and taking them as starting points for execution makes a scheduler necessary. Imagine a realistic scenario with an execution time of 30 minutes (to apply a rule to all CDRs) in which 4 new incoming rules have to be applied to the CDRs. Therefore, a scheduler similar to the concept of a monitor in operating systems was implemented. Each execution is handled as a job. Only one job is allowed to run at a time, to keep the database consistent. If four new rules are sent via the XMPP client, the jobs required to apply the rules are added to a queue for later execution. When a job has finished, the next job from the queue is executed.
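
A scheduler of this kind can be sketched with a queue and a single worker thread, which guarantees that only one job runs at a time and that jobs are executed in arrival order. This is a minimal illustration using the Python standard library, not the scheduler implemented in SCAMSTOP; the class and method names are made up.

```python
import queue
import threading

class JobScheduler:
    """Runs submitted jobs one at a time, in first-in first-out order."""

    def __init__(self):
        self._jobs = queue.Queue()
        self.errors = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, job):
        """Queue a callable for later execution."""
        self._jobs.put(job)

    def _run(self):
        while True:
            job = self._jobs.get()
            try:
                job()
            except Exception as exc:  # keep the worker alive if a job fails
                self.errors.append(exc)
            finally:
                self._jobs.task_done()

    def wait(self):
        """Block until every queued job has finished."""
        self._jobs.join()

# Four rules arrive at once; the jobs are executed strictly one after another.
applied = []
scheduler = JobScheduler()
for rule_id in range(4):
    scheduler.submit(lambda rule_id=rule_id: applied.append(rule_id))
scheduler.wait()
```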

Additional information can be found in:

- SCAMSTOP deliverable D3.2
- SCAMSTOP deliverable D4.1.

Testing activities

1. Data used for testing

Real life data

For testing the different modules, both real and synthesised CDR data were used. The real data comprises three months of CDRs (provided by VozTelecom), from 1 January 2011 to the end of March 2011. The data belongs to 4825 subscribers who generated almost 31 million CDRs during these 3 months. The CDRs were provided in a TSV file, which required the development of a parser module, using parallel processes, to parse the information in the file and store it in SQL-based 'calls' tables within a reasonable period of time. After converting the CDR data to a format we could use, we started analysing it. The VoIP provider also supplied a small set (17 cases) of compromised accounts with the corresponding activities. Most of the fraudulent activities involved suspected persons who:

(1) exceeded a virtual threshold when using the VoIP service; or
(2) called some unusual destinations.

The data was first analysed and some preliminary discussions with the VoIP provider were held. The discussions mainly aimed to understand aspects of the data related to the rate of successful calls and the rate of calls ending with the state (487: request terminated). Some details were provided in the deliverable D4.1.

It is also interesting that, when analysing the related activities, we noticed that 70 % of the scam activities took place between 6:00 and 18:00, i.e. during the day (morning and afternoon). This is surprising, since fraudsters usually act during evenings and nights to escape detection. We also noticed that 57 % of the fraudulent calls were made in the afternoon.

Synthesised data

In order to test the robustness of some of the anti-fraud algorithms, a tool was used to generate synthesised CDR data. This tool is based on the SIPp tool, which generates SIP calls. It has been extended so that a profile can be loaded in order to generate calls and CDRs from several users. In SCAMSTOP, the profile characteristics were retrieved by analysing CDRs from the VoIP providers. These characteristics are the following:

- PDF of inter-arrival time: Poisson, Pareto, Weibull and Gaussian distributions have been considered. By analysing CDRs, it was found that the mean inter-arrival time during rush hours (08:00 - 16:00) is shorter than during the rest of the day (00:01 - 07:59 and 16:01 - 24:00).
- Service time follows an exponential distribution.
- Probability of the type of outgoing call: outgoing calls fall into the following categories: local, national, mobile and international (within the European Union (EU), United States of America (USA), other). The probability mass of outgoing calls towards a specific destination (e.g. Asia) is smaller than that of a local / national call.

By modifying either of the following parameters, a fraud scenario can be defined:

- Increasing the number of calls within a certain period (either rush hour or rest of the day); interarrival time is decreased
- Probability of outgoing call towards a specific destination (e.g. USA, Asia) is increased.
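
The effect of these two parameters can be illustrated with a small generator. This sketch is independent of the SIPp extension described here: it assumes exponentially distributed inter-arrival times (one of several distributions that were considered), exponentially distributed service times, and made-up profile values and destination categories.

```python
import random

def generate_calls(n, mean_interarrival, mean_duration, dest_probs, seed=0):
    """Generate n synthetic calls for one user profile.

    mean_interarrival, mean_duration: means in seconds
    dest_probs: dict destination category -> probability
    """
    rng = random.Random(seed)
    destinations = list(dest_probs)
    weights = [dest_probs[d] for d in destinations]
    clock = 0.0
    calls = []
    for _ in range(n):
        clock += rng.expovariate(1.0 / mean_interarrival)
        calls.append({
            "start": clock,
            "duration": rng.expovariate(1.0 / mean_duration),
            "dest": rng.choices(destinations, weights)[0],
        })
    return calls

# Hypothetical normal profile versus a fraud scenario: shorter inter-arrival
# time and a higher probability of international destinations.
normal = generate_calls(500, 60.0, 120.0, {"local": 0.7, "national": 0.25, "international": 0.05})
fraud = generate_calls(500, 6.0, 120.0, {"local": 0.1, "national": 0.1, "international": 0.8})
```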

Some screenshots of the extensions made to the SIPp tool were inserted in the deliverable D4.1.

2. Use case: How did we test the effectiveness of the signature based technique?

To check the efficiency of the developed algorithm, we used two samples to which we matched the results of the detection.

Known fraudulent cases

These are the 17 cases mentioned earlier, which were provided by the VoIP provider (VozTelecom).

Unknown fraudulent cases

Call centre activities can also be considered a kind of fraud if they do not respect the subscription agreement. Unfortunately, VoIP providers do not have any mechanism to deal with this issue. This prompted us to investigate how unsupervised techniques could be used to detect such activities.

The first step in this direction was to define some heuristics to classify the provided data into potential call centres and non-call centres. The heuristics we used are related to the total number of calls and the success rate. One could also think of other filtering criteria, such as the number of calls made during nights and weekends. These numbers should be very small, as call centres are also enterprises that are often not active during these periods. However, this is not always true, as the VoIP providers might have business with countries in other continents, which requires taking the time offset between these countries into account. To summarise, the heuristics we used for classification are the following:

- The total number of calls made from a given account is greater than 100 per day, and this happened at least 3 times during the 3 months.
- The calls success rate is less than 60 %.
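
The two heuristics can be written as a simple filter. The sketch assumes that both criteria must hold for an account to be listed, and that the success rate is computed over all calls of the account during the observation period; the function and parameter names are illustrative.

```python
def is_potential_call_centre(daily_calls, successful_calls,
                             per_day=100, min_days=3, max_success=0.60):
    """Apply the two heuristics to one account.

    daily_calls: number of calls per day over the observation period
    successful_calls: number of successful calls over the same period
    """
    busy_days = sum(1 for n in daily_calls if n > per_day)
    total = sum(daily_calls)
    success_rate = successful_calls / total if total else 1.0
    return busy_days >= min_days and success_rate < max_success
```

For example, an account with daily counts of 150, 120, 30 and 101 calls and 200 successful calls out of 401 satisfies both criteria.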

These criteria allowed us to generate a list of 119 potential call centres, which represent 2.46 % of the total number of accounts used in the tests. This seems reasonable because scam-related data usually make up a small portion of the entire data. This list was also reviewed by the VoIP provider, who confirmed the plausibility of the filtering criteria as well as of the subscription accounts appearing on the list.

Performance and effectiveness

A question frequently asked about unsupervised classification techniques is how to assess the reasonableness of the obtained results, and by which standards. One method is to create a list of potentially suspicious accounts and ask the VoIP provider to assess each of these accounts in terms of its function and the criteria used for filtering. For instance, some accounts are used by the VoIP provider to test the status of the network components, so one should expect a huge number of calls generated by these accounts with a low call success rate: the provider is often only interested in whether these components react, so there is no need to establish the calls. This behaviour (a huge number of calls with a low success rate) is typical call centre behaviour, and the test accounts cannot be separated from the potential call centres without the help of the provider.

Unfortunately, this method requires time and resources, especially if the created list is long. In our work, we used this method as a first assessment, and the results are presented in the previous section. Another method is to compare the signature-based technique with other unsupervised or supervised methods. In this project, we also compared our work with the NN-SOM technique discussed earlier.

The testing activities related to NN-SOM are based on a modified version of software provided by T.E.I. of Mesolonghi. This software is, in turn, based on the Encog neural network framework.

When applying NN-SOM to the unknown fraud cases, we noticed that 94 cases out of 119 were detected, which means a detection rate of 79 %. The signature-based technique performs better in detecting potential call centre accounts, with a detection rate of 95 % compared with the 79 % obtained by NN-SOM. However, a closer look at the results of both techniques shows that 89 potential call centre accounts (out of 119) belong to both lists. In spite of the difference between the detection rates, almost 75 % of the 119 potential call centre accounts were detected by both techniques. This rate is more than reasonable in the context of unsupervised techniques.

3. Example of real life testbed: Integration with VozTelecom's infrastructure

The main objective of the testbed deployed at VozTelecom is to feed the SCAMSTOP framework modules with CDRs generated in real life. This testbed is also used to give feedback to the research partners about the usability of the framework and possible improvements needed to achieve a good integration with VozTelecom's platform.

To understand where the SCAMSTOP framework has been placed in VozTelecom's platform, we first have to know how VozTelecom does the accounting for its users. The following elements take part in the CDR generation process:

- SIP proxy: This is the core element in VozTelecom's platform. It routes all users' SIP signalling. The proxy sends a Radius accounting request to a Radius server for each of the following SIP messages:

i. start: 200 and ACK
ii. stop: BYE
iii. failed: error cause.

- Radius server: This component stores the proxy's accounting requests in plain text files. A new file is generated each day. Note that in this file the duration of a call is not directly known, since for a single call we have three records (two starts and one stop).
- CDR parser: This is a program that reads the Radius text files and parses them to load all the records into a DB.
- CDR treat DB: This is a temporary DB where all accounting records for a call are merged into a single CDR. That CDR contains the duration of the call and may be further processed to format it with some custom parameters.
- CDR DB: This is the final DB where CDRs are stored.
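
The merging performed in the 'CDR treat DB' can be illustrated as follows. The record layout (a call identifier, a status and a timestamp in seconds) is hypothetical, and measuring the duration from the earliest 'start' record to the 'stop' record is an assumption; the text does not say which of the two start records is used.

```python
def merge_accounting(records):
    """Merge the accounting records of each call into a single CDR.

    records: list of dicts with hypothetical keys "call_id",
             "status" ("start", "stop" or "failed") and "time" (seconds)
    """
    calls = {}
    for rec in records:
        call = calls.setdefault(
            rec["call_id"], {"starts": [], "stop": None, "failed": False}
        )
        if rec["status"] == "start":
            call["starts"].append(rec["time"])
        elif rec["status"] == "stop":
            call["stop"] = rec["time"]
        elif rec["status"] == "failed":
            call["failed"] = True

    cdrs = []
    for call_id, call in calls.items():
        if call["failed"]:
            cdrs.append({"call_id": call_id, "state": "failed", "duration": 0})
        elif call["starts"] and call["stop"] is not None:
            duration = call["stop"] - min(call["starts"])
            cdrs.append({"call_id": call_id, "state": "completed", "duration": duration})
        # Calls with a start but no stop yet are still in progress: no CDR.
    return cdrs
```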

To feed the SCAMSTOP DB with new CDRs, one option was to create a script that periodically dumped new CDRs from the 'CDR DB' to text files, which could then be loaded manually using Wharf; however, a more automatic procedure was chosen.

As Wharf commands have to be run manually and it is not possible to automate the whole detection process, a first approach to the real-life integration is to create a process in the 'CDR treat DB' so that, once it has handled all the information needed to produce one final CDR, it feeds both the 'CDR DB' and the 'SCAMSTOP DB' with the required information in the appropriate format.

The SCAMSTOP database and all the framework modules (detection modules, events, web server) have been installed on a dedicated server in VozTelecom's internal network.

The results obtained from applying the real-life traffic to three modules of the SCAMSTOP framework are described in the deliverable D4.1. The three chosen modules are the rule-based technique, the signature-based technique and the probabilistic approach. The results correspond to two months of data. The tests were performed for two different types of VozTelecom product users:

- Oigaa office: Group of very homogeneous users, with few simultaneous calls, similar call patterns, and pre-provisioned devices with closed configurations (Hosted PBX service). For this group of users, VozTelecom has not had many fraud attacks since it was launched.
- Oigaa direct: Very heterogeneous users with different call patterns, different numbers of simultaneous calls and devices configured directly by the clients (SIP trunking service). This group of users is the one that has suffered most of the attacks, mostly because of misconfiguration of the clients' SIP agents.

The testing results are also included in the deliverable D4.1.

Additional information can be found in:

- SCAMSTOP deliverable D4.1.
