CORDIS - EU research results

ONLINE ANALYSIS TOOL FOR THE OPTIMIZATION OF SOCIAL MEDIA CAMPAIGNS

Final Report Summary - OPTIMIZR (ONLINE ANALYSIS TOOL FOR THE OPTIMIZATION OF SOCIAL MEDIA CAMPAIGNS)

Executive Summary:
The Optimizr project’s aim was to develop an innovative analysis tool and Campaign Optimization System (COS) to help SME marketing agencies and their customers (large companies or brands) improve the efficiency of their social media marketing campaigns. The project is based on the combination of social media raw data, semantic analysis, a marketing knowledge base and modeling capabilities that enable the system to provide predicted outcomes from various scenarios and social media marketing strategies.
From a scientific perspective, the Optimizr project combines Social Network Analysis and Information Diffusion methods with text mining capabilities to provide the user with a Decision Support System that draws on a rich knowledge base of previous experience to help optimize campaigns and predict campaign impact. By effectively monitoring and managing a marketing campaign’s impact on social networks and its flow through network user nodes, SME marketing agencies will be able to identify patterns that lead to better campaign optimization, better engagement on social networks and improved marketing campaigns. There is currently no other tool on the market that offers the scientific features deployed in the Optimizr prototype. Optimizr will:
1. Inform: Gather information on specific networks and domains in order to produce Information Diffusion Models. These models will allow, in a later stage, the suggestion of actions in order to improve the performance of the campaign executed through the Optimizr system.
2. Recommend: Suggest actions for executing a campaign. This is done before the campaign is executed and, if the performance achieved does not match the target, during its execution as well.
3. Predict: Predict performance of campaigns based on previous cases. It will consider audience/media, domain, content and author.

Project Context and Objectives:
The main objectives for the complete period of the Optimizr project were the following:

– Review and update the state-of-the-art with the relevant new products that have appeared since the proposal preparation stage
– Gather the user requirements from the SME end users by means of surveys and other requirements elicitation techniques
– Define the use cases covering the whole list of features that the system must implement to meet the SME needs
– Develop a Full System Specification, containing all the requirements for the system, both functional and non-functional.
– Design and implement the information extraction module, which collects data from the social media sources, using APIs where available.
– Define the methodology and implement the semantic analysis module, which aims to extract information from the collected text data: semantic features, named entities, relationships and reactions, topic classification.
– Develop the user interface for communicating all the relevant information to the user, in a clear and understandable way, providing effective visualizations of the data.
– Provide a database solution for storing all the relevant information, including the data collected from the Web and the results of the execution of each processing module.
– Develop a server infrastructure suitable for the expected data volumes and traffic.
– Define the methodology and implement the social network analysis module, which will facilitate quantitative and qualitative analysis of social networks.
– Develop a methodology for Data Mining, aimed at identifying the factors that influence how the information is spread in all kinds of networks.
– Develop knowledge base models, based on SME expertise in marketing campaign execution
– Create an information diffusion model that describes propagation from person to person through a social network, to be used for KPI prediction
– Develop a Decision Support System that provides the user with recommendations on which changes to the campaign definition would increase the campaign impact.
– Integrate all the developed modules using efficient inter-module communication with the aim to build a complete working prototype.
– Perform system validation in order to determine whether the system works correctly and meets the users’ requirements.
– Perform training and dissemination activities in order to facilitate the take-up of results of the project by the SMEs.
– Perform Knowledge Management tasks and develop a plan for IPR protection
– Finalize the PUDF and effectively disseminate the project’s scientific and technical results.

To achieve these results, the work was organised into the following work packages (WPs), each with its own objectives:

WP1
The whole consortium worked together to define the functional specifications of the project. These specifications were the main guidelines for the entire project development. During the definition of the functional requirements, some important issues were decided, such as how to overcome the social media API limitations, which social media are most relevant for the SMEs and how to define a campaign. In addition, some use cases were defined by the SMEs with the most experience in real social media campaigns.

WP2
The Optimizr system is based on a set of data extracted from social media networks such as Twitter. This information is collected by an information extraction module developed during this WP. After the data is extracted, some pre-processing is required in order to filter out noisy data. This pre-processing is based on NLP methodologies for extracting concepts, topics, etc.

WP3
User interfaces (UIs) are a basic functionality of the project, because interaction with the user is key to the success of the end results. The visualizations included in these UIs are important for communicating the information efficiently, and the dashboards aim to provide a summary of all the campaign KPI values and predictions.

WP4
Databases were developed and tested during this WP. After an intensive study of the different possibilities, the project RTDs decided that Elasticsearch was the best solution to one of the project challenges: a huge amount of data that has to be stored in JSON format (the format of the APIs’ output). In addition, a Management System was built that allows the users to execute actions on the various social media services.

WP5
The main scientific goal of this WP was to create methodologies for extracting information from social media content and structures. To this end, two main methods were developed: Semantic Analysis and Social Network Analysis. Semantic Analysis extracts features from social media content, based on the main techniques of Natural Language Processing. Social Network Analysis extracts features from social media structures (links between users), mainly which users are most influential. Finally, machine learning was used to build a predictive model of how a piece of content will spread across the social media network, by learning from a training set how these features correlate with the spread of a message.
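To illustrate what identifying the most influential users from the links between them can look like, the following is a minimal sketch that ranks users with a PageRank-style power iteration. PageRank is used here only as a well-known stand-in; the report does not state which influence measure Optimizr applies, and the edge list, user names and function name are hypothetical.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over directed (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Every node receives the same small "teleport" share first.
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for src in nodes:
            # A node without outgoing links spreads its rank over all nodes.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical data: an edge (a, b) means "a retweeted or mentioned b".
edges = [
    ("ana", "brand"),
    ("ben", "brand"),
    ("cho", "brand"),
    ("brand", "ana"),
    ("cho", "ben"),
]
scores = pagerank(edges)
most_influential = max(scores, key=scores.get)
```

In this toy graph the account that receives the most attention from other users ends up with the highest score. A score of this kind could serve as one of the network features fed to the predictive model.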

WP6
Once the scientific methods were fully developed, an optimization system was created by: 1) describing the campaigns by a set of attributes, thereby creating a knowledge base; 2) developing an information diffusion model to understand how a message spreads across a network once a campaign has started; 3) creating a decision support system that recommends actions to optimize the impact of the campaign.

WP7
After the development of the previous technical and scientific modules, an integration phase was required to connect all of them. Then a first fully working version was developed in order to be tested by the SMEs.

WP8
Once the first prototype was created, a validation procedure was implemented in order to test the whole solution and improve its performance from a functional, technical and usability point of view. After that, an iterative process was followed to check the improvements in each new version.

WP9
Training activities were delivered by the RTDs throughout the whole project, mainly during its final phase. The goal of these activities was to enable the SME participants to assimilate the results of the project, both as owners and as users.
At the same time, a number of dissemination activities were carried out, mainly conferences, presentations and workshops, but also scientific papers, among them a paper on the Optimizr information diffusion methodology published in the journal Frontiers in Physics.

WP10
The objective of the preparation for exploitation was to support the participating SMEs in protecting and using the research results to their best advantage and exploit the results in an effort to increase the competitiveness of the SME participants. All this effort was documented in the report on technology watch, foreground protection and IPR.

Project Results:
The Optimizr project started in September 2014 and lasted for 24 months. Throughout the project, the SMEs and the RTDs collaborated productively: the SMEs actively followed the research and development work and took part in the technical and general meetings, enhancing the process with their feedback and practical knowledge.
WP1
During the first 6 months of the project, all the SMEs participated in the study to gain a better understanding of the requirements of the product.
A study was made of competitors’ existing system specifications, methods and processes. In addition, the state-of-the-art review of similar tools was kept up to date throughout the project.
Feedback was collected from all the members of the consortium through focus groups, face-to-face interviews and online surveys. The output of these actions was used to define the system specification and requirements.
Furthermore, preliminary use case tests were defined by OVER, WAVE and ICON. These use cases were to be used in WP8 to verify that the developed product fulfils the user requirements defined during this WP.
The deliverable related to this WP, Full System Specifications, describes the requirements for the system, as decided by the whole consortium during this phase of the project.
WP2
A module that retrieves information from the social media APIs was developed over 7 months. Initially, the module integrated the Twitter and Facebook APIs. However, changes to the Facebook public API limited access to public data and made the information it provided unusable.
The API Management System is built to be modular so that social media APIs can easily be added or changed.
After data acquisition, a data pre-processing phase is carried out. It includes text pre-processing (to clean and tokenize the content), topic classification (to filter the content according to topics defined by the user) and part-of-speech (POS) tagging to identify the POS of each term.
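As an illustration only, the cleaning, tokenizing and topic-filtering steps can be sketched as follows. The regular expressions, the function names and the keyword-based topic filter are assumptions made for this example; the report does not describe the actual implementation, and POS tagging is left out because it needs a trained tagger.

```python
import re

URL_RE = re.compile(r"https?://\S+")
# A token is a word, optionally prefixed by '#' (hashtag) or '@' (mention).
TOKEN_RE = re.compile(r"[#@]?\w+")


def preprocess(text):
    """Lowercase the text, strip URLs and split it into tokens."""
    cleaned = URL_RE.sub(" ", text.lower())
    return TOKEN_RE.findall(cleaned)


def matches_topic(tokens, topic_keywords):
    """Keyword filter: True if any token (ignoring a leading '#') is a topic keyword."""
    return any(token.lstrip("#") in topic_keywords for token in tokens)


tokens = preprocess("Loving the new #Optimizr dashboard! https://example.com @agency")
on_topic = matches_topic(tokens, {"optimizr", "dashboard"})
```

Messages for which `matches_topic` returns False would be treated as noise and dropped before the later analysis steps.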
The deliverable of this WP describes how the APIs of the several information providers are integrated and how the information is extracted from them. Information on how new APIs can be added is explained in the deliverable Plan for Future Development.
WP3
This WP covered the development of the user interfaces, so it ran for nearly all 24 months of the project.
Usability studies were used to test whether the user interface is user-friendly and whether its look and feel is attractive. All the SMEs were involved in this task, mainly ACCURAT due to their expertise in UI design.
During the first period the user interface was designed, following the common standards and taking into account the results of the usability studies. The mockups were developed and agreed within the consortium. In the second period of the project the user interface was implemented using Hypertext Markup Language (HTML), Cascading Style Sheets (CSS) and JavaScript, with help of third-party libraries.
In order to communicate all the information to the user clearly and effectively, data visualizations were implemented. Different types of visualization were developed for different kinds of data, including graphs, line charts and gauge charts. Third-party libraries were used for this, such as D3, a powerful open-source visualization library.
The user interface includes a dashboard that provides views of the KPIs defined for the campaign. The dashboard consists of widgets, which are graphical or textual representations of the KPI values. The dashboard is flexible: the user can add widgets by selecting which KPI to show and which visualization type to use. Predicted values for the KPIs are also shown on the dashboard.


WP4
The main objective of this work package was to create the databases that would store all the relevant information in the system. For different kinds of the stored information, different types of databases are more suitable. An extensive survey of the existing database technologies was performed in order to find the best solutions.
For storing the data extracted from the Web (social network data and blog messages) and the corresponding results of the semantic analysis, Elasticsearch was chosen; it combines the features of NoSQL databases and search engines. The documents are stored in JSON format.
For the rest of the data a relational database is used. PostgreSQL was chosen as it is a free and robust RDBMS with rich features.
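The idea of keeping a collected message and its semantic-analysis results together in one JSON document can be sketched as below. All field names and values are hypothetical; the actual document structure is the one defined in Deliverable 4.1, which is not reproduced in this summary.

```python
import json


def build_document(raw_post, analysis):
    """Combine a raw API record and its analysis results into one JSON-ready dict."""
    return {
        "source": raw_post["source"],
        "post_id": raw_post["id"],
        "author": raw_post["user"],
        "text": raw_post["text"],
        "created_at": raw_post["created_at"],
        "analysis": analysis,
    }


# Hypothetical record, shaped like the output of a social media API.
raw_post = {
    "source": "twitter",
    "id": "1001",
    "user": "agency_demo",
    "text": "Try the new dashboard",
    "created_at": "2015-06-01T10:00:00Z",
}
analysis = {
    "topics": ["marketing"],
    "entities": [],
    "tokens": ["try", "the", "new", "dashboard"],
}

document = build_document(raw_post, analysis)
# The serialised string is what would be sent to the document store.
serialised = json.dumps(document, sort_keys=True)
```

Because the API output is already JSON, a document store of this kind can hold it with little transformation, while structured campaign data stays in relational tables.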
During the first period:
– the database solutions were selected;
– the information organization principles and the JSON structure for data saved in Elasticsearch were defined;
– the database schema was designed for PostgreSQL.
The results of this work are presented in Deliverable 4.1.
During the second period the PostgreSQL schema was extended to support the new features and modules developed in that period.
Another objective of this work package was to develop a suitable server infrastructure for the expected volumes and traffic. In the first period an extensive survey of existing computing platforms was performed (including Apache Hadoop, Spark and Storm); the comparison report was provided in Deliverable 4.1. In the second period, after further evaluation, Apache Storm was selected as the main platform for the semantic analysis and prediction modules, due to its ability to process data in real time and its scaling capabilities, which will allow the solution to be scaled if the data load increases in the future.
WP5
Semantic analysis
– Feature extraction
– Entity extraction
– Topic classification
– Extraction of relationships and reactions

Social Network Analysis

Data Mining
The objective of this task is to create a Machine Learning Model Builder, i.e. a system that creates ML models given a training set. The work is organised into the following phases:
Phase 1: Definition of the data mining objective. The general objective is to predict the impact of a tweet, message, etc. but we have to define the specific method to measure it.
Phase 2: Data understanding. Become familiar with the data in order to identify data quality problems, gain first insights into the data and detect interesting subsets from which to form hypotheses about hidden information. The Data Mining model is based on information about the content and the author: 2.1 Content: semantic features. 2.2 Author: network analysis features.
Phase 3: Data Preparation. Includes all activities required to construct the final data set (data that will be fed into the modeling tool) from the initial raw data. Tasks include table, case, and attribute selection as well as transformation and cleaning of data for modeling tools.
Phase 4: Modeling. Select and apply a variety of modelling techniques, and calibrate tool parameters to optimal values. Typically, there are several techniques for the same data mining problem type. Some techniques have specific requirements on the form of data. Therefore, stepping back to the data preparation phase is often needed.
Phase 5: Development. Implementation of the data mining algorithms to create the models according to the training set data.
Phase 6: Evaluation. Thoroughly evaluate the model, and review the steps executed to construct the models, to be certain it properly achieves the objectives and then select the best models according to this evaluation.
Phase 7: Deployment (Demo). The ML model builder creates models according to the following: Input: a training set containing the content of the messages and the network on which impact is evaluated. Output: a model that predicts the expected impact (expressed through some of the indicators defined during the definition phase).
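The input/output contract of such a model builder can be sketched in a few lines. The example below uses k-nearest-neighbour regression purely as a simple stand-in, since the report does not name the learning algorithms used; the feature layout and the figures in the training set are invented for illustration.

```python
import math


def build_model(training_set, k=2):
    """Return a predictor that averages the impact of the k nearest training examples."""

    def predict(features):
        # Rank the training examples by Euclidean distance to the new message.
        nearest = sorted(training_set, key=lambda ex: math.dist(ex[0], features))[:k]
        return sum(impact for _, impact in nearest) / len(nearest)

    return predict


# Hypothetical rows: ((author_influence, has_hashtag, has_link), observed retweets)
training_set = [
    ((0.9, 1, 1), 120),
    ((0.8, 1, 0), 100),
    ((0.2, 0, 1), 10),
    ((0.1, 0, 0), 4),
]

model = build_model(training_set)
predicted_impact = model((0.85, 1, 1))
```

The builder takes a training set and returns a callable model, so the same function can be re-run whenever new campaign data becomes available.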

WP6
This WP contains 3 modules: the Knowledge Base models, the Information Diffusion model and the Decision Support System.
The Knowledge Base models are basic predictive models. They are useful when there is no training set for a given kind of campaign. A model was created for each KPI. The process comprises the following steps:
Definition of campaign features. Variables that define the campaign.
Knowledge acquisition. Knowledge acquisition is the extraction and formulation of knowledge derived from experts (the SMEs) with a set of surveys.
Knowledge Base. A knowledge base has a description of the elements in the process along with their characteristics, functions, and relationships. It also contains rules about the actions to implement as a result of certain events.
Development of the module. Software for creating and maintaining a knowledge base.
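A knowledge base of this kind can be pictured as a set of expert rules applied to the campaign features. The sketch below is an assumption about the general shape only: the rules, multipliers and feature names are invented, not taken from the SME surveys.

```python
# Hypothetical expert rules: (condition on the campaign, multiplier on the baseline KPI).
RULES = [
    (lambda c: c["has_image"], 1.5),
    (lambda c: 18 <= c["post_hour"] <= 21, 1.2),
    (lambda c: c["hashtags"] > 3, 0.8),
]


def predict_kpi(campaign, baseline):
    """Scale a baseline KPI value by every rule whose condition the campaign meets."""
    estimate = baseline
    for condition, multiplier in RULES:
        if condition(campaign):
            estimate *= multiplier
    return estimate


campaign = {"has_image": True, "post_hour": 19, "hashtags": 1}
estimate = predict_kpi(campaign, baseline=100)
```

Since the rules encode expert judgement rather than learned parameters, a model like this can give a first estimate for a new kind of campaign before any training data exists.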

The Information Diffusion Model predicts how each KPI evolves based on its previous values. The phases of this module were:
Definition of the objective. Identification of a problem and analysis of the requirements of the situation.
Data understanding. Identification of the variables for the model. Forecasting variables or parameters is part of the construction of the DSS.
Data Preparation. Assumptions need to be specified and any required forecast variables prepared.
Optimization Model. The Optimization analysis model is the model for finding an optimum value for selected variables given certain constraints. The optimum value is defined by the impact variables chosen as key variables.
Model evaluation is the process of comparing a model's output with the actual behavior of the phenomenon that has been modeled.
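The project objectives describe the diffusion model as modelling propagation from person to person through a social network. One textbook way to simulate such propagation is the independent cascade model, sketched below. This is offered as an illustration of the general idea, not as the Optimizr methodology; the follower graph and the single sharing probability are hypothetical.

```python
import random


def independent_cascade(followers, seeds, share_prob, rng):
    """One cascade: each newly active user gets one chance to activate each follower."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for user in frontier:
            for follower in followers.get(user, []):
                if follower not in active and rng.random() < share_prob:
                    active.add(follower)
                    next_frontier.append(follower)
        frontier = next_frontier
    return active


def expected_reach(followers, seeds, share_prob, runs=1000, seed=42):
    """Average number of users reached over repeated simulated cascades."""
    rng = random.Random(seed)
    total = sum(
        len(independent_cascade(followers, seeds, share_prob, rng))
        for _ in range(runs)
    )
    return total / runs


# Hypothetical follower graph: the key's messages are seen by the listed users.
followers = {"brand": ["ana", "ben"], "ana": ["cho"], "ben": ["cho", "dee"], "dee": []}
reach = expected_reach(followers, ["brand"], share_prob=0.5)
```

Comparing the simulated reach with the reach actually observed for past campaigns corresponds to the model evaluation step above.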

The decision support system will recommend actions to optimize the campaign impact. The steps were:
Identification of objectives and resources. The specific objectives and the available resources are identified.
Decision Parameters. Definition of the parameters that allow users to define the type of recommendations.
Genetic algorithm. Method to observe how changes to selected variables affect the outcome of the campaign and compare the expected impacts.
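A minimal sketch of how a genetic algorithm can search over campaign definitions is given below. The three campaign variables, their ranges and the toy `predicted_impact` function are assumptions made for the example; in a real system the fitness would come from the prediction models, and the recommended changes would be read off the best candidates found.

```python
import random

# Hypothetical campaign variables: post hour, number of hashtags, has image.
RANGES = [(0, 23), (0, 5), (0, 1)]


def predicted_impact(campaign):
    """Toy fitness function standing in for a real impact prediction model."""
    hour, hashtags, has_image = campaign
    return (-(hour - 18) ** 2 + 10 * min(hashtags, 2)
            - 5 * max(hashtags - 2, 0) + 20 * has_image)


def random_campaign(rng):
    return tuple(rng.randint(lo, hi) for lo, hi in RANGES)


def crossover(a, b, rng):
    """Uniform crossover: each variable is copied from one of the two parents."""
    return tuple(rng.choice(pair) for pair in zip(a, b))


def mutate(campaign, rng, rate=0.2):
    """Replace each variable by a random valid value with probability `rate`."""
    return tuple(
        rng.randint(lo, hi) if rng.random() < rate else gene
        for gene, (lo, hi) in zip(campaign, RANGES)
    )


def optimise(generations=40, pop_size=30, seed=0):
    rng = random.Random(seed)
    population = [random_campaign(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=predicted_impact, reverse=True)
        parents = population[: pop_size // 2]  # the better half survives unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=predicted_impact)


best = optimise()
```

Because the better half of each generation is carried over unchanged, the best predicted impact never decreases from one generation to the next.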

WP7
The main objective of this WP was to integrate all the previously developed modules and create a first version of the product. The work package started in the second period and continued until the end of the project. The modules developed in WP2, WP3, WP4, WP5 and WP6 were integrated using two main approaches: message-based communication using message queues, and data exchange using databases. Message queues provide asynchronous communication between the modules: one module sends control messages or data to a specific queue, while the other module continuously listens to this queue, waiting for new messages. When a message is received, it is processed and the results are saved. The messages are formatted as JSON. For each inter-module communication the structure of the messages was defined, specifying the required and optional fields and the type of each field. RabbitMQ is used as the message broker, and each module uses a client library suited to the programming language it is written in.
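The message-based pattern described above can be illustrated without a broker. In the sketch below an in-process `queue.Queue` from the Python standard library stands in for a RabbitMQ queue, and the message schema (field names, types, required flags) is hypothetical; the real message structures are not given in this summary.

```python
import json
import queue

# Hypothetical message structure: field name -> (expected type, required?)
SCHEMA = {
    "campaign_id": (int, True),
    "text": (str, True),
    "language": (str, False),
}


def validate(message):
    """Check required fields and field types against the agreed structure."""
    for field, (field_type, required) in SCHEMA.items():
        if field not in message:
            if required:
                raise ValueError(f"missing required field: {field}")
            continue
        if not isinstance(message[field], field_type):
            raise TypeError(f"field {field} must be {field_type.__name__}")
    return message


def send(q, message):
    """Producer side: validate, serialise to JSON and put on the queue."""
    q.put(json.dumps(validate(message)))


def consume(q, handler):
    """Consumer side: drain the queue, decode each message and process it.

    A real listener would block and wait for new messages instead of stopping
    when the queue is empty.
    """
    results = []
    while not q.empty():
        results.append(handler(validate(json.loads(q.get()))))
    return results


semantic_queue = queue.Queue()  # in-process stand-in for a broker queue
send(semantic_queue, {"campaign_id": 7, "text": "New product launch"})
processed = consume(
    semantic_queue,
    lambda msg: {"campaign_id": msg["campaign_id"], "tokens": msg["text"].lower().split()},
)
```

Validating on both sides means a module written in another language can rely on the same field contract, which is the purpose of fixing the message structure per communication channel.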
The data exchange between the modules is also performed using databases. Different modules use the same databases: one module saves the results of its execution to a table (in PostgreSQL) or an index (in Elasticsearch), and all the other modules can access this data. The campaign management system is integrated with the PostgreSQL database through Django models, and the rest of the modules through PostgreSQL drivers. Integration with Elasticsearch uses Elasticsearch client libraries.
This WP also included the deployment task: installing and deploying the first fully working release of the system so that end-user SMEs could start using it. One of the first steps in this task was the creation of a code repository in a version control system. GitHub was chosen as the repository hosting service, and all the code for the project was pushed to a private repository. All interested RTD and SME representatives were given access to it.
A demo server was created in the first period of the project. First it was used to demonstrate the progress made on separate work packages, and later in the project the prototype of the final system was deployed and made accessible to the consortium. The prototype was updated as new modules were being developed, until it reached its final state.
The instructions for system installation and configuration were provided in the code repository. In addition, for those partners who wish to use a configuration management system, installation scripts for Puppet were developed, which install all the necessary software and dependencies automatically.

WP8
Test Cases
SMEs have defined test cases of real campaigns. Then, the entire consortium tested the demo version of the product, in order to evaluate the following key quality factors:
– Efficiency: the amount of computing resources and code required by a program to perform its function (evaluated by the RTDs).
– Complexity: whether the constructs can be used more effectively to decrease the architectural complexity (evaluated by the RTDs).
– Understandability: whether the system is easily understood (evaluated by the SMEs).
– Testability/Maintainability: the effort required to locate and fix an error in a program, as well as the effort required to test a program to ensure that it performs its intended function (evaluated by the whole consortium).

System Validation
Validation consists of:
Scientific validation, concerning all the scientific modules of the project, including semantic analysis, social network analysis and predictions. All these methods have been tested and their precision is very similar to the state of the art.
Technical validation, from a programming point of view, checking the performance and scalability of all the back end infrastructure.

Plan for Future Development
Includes the definition of a strategy for future updates of the product, including:
– Adaptation to new social media sources.
– Adding new languages.
– Adding new KPIs to the dashboard.
– Adding new visualizations to the dashboards.
– Scaling the tool to a higher volume of data.

WP9
Training
Training activities were carried out by the RTD performers in order to transfer the required technical and managerial knowledge to all the consortium SMEs. Training sessions were held throughout the project to maximise the SME participants’ understanding of the Optimizr technology. Additionally, an Optimizr user guide was provided to the SMEs as a reference for using the platform.

Dissemination
The project website (optimizr.eu/) was developed during the first three months of the project. The visual identity of the project was also chosen at the beginning of the project, with the consortium selecting from several logo proposals.
As part of the dissemination activities, the partners of the Optimizr consortium hosted and attended several events to disseminate the project results. In addition, several publications, both scientific and non-scientific, were produced.

Potential Impact:
Improved SME competitiveness

The inability to properly optimise and measure campaigns is the biggest hindrance for SME marketing agencies looking to provide social media marketing services to their clients, as ROI is extremely hard to prove. Much of the optimisation work must be done by trial and error and, even worse, manually. Optimizr will take the guesswork out of social media strategy planning by providing a series of benchmarks for success, and will allow finely tuned segmentation and targeting for total optimisation of campaigns, saving SME marketing agencies hundreds of thousands of euros per year in human resources and revenue losses. While lack of manpower has traditionally posed a problem for SMEs looking to serve large clients (such as multinational companies and brands), new technologies in social media monitoring and information extraction are beginning to open doors for small businesses in ways never before imagined. The Optimizr project will ensure that SMEs take full advantage of the opportunity that technology presents by providing them with a comprehensive tool that allows them to 1) effectively handle and make sense of the “data deluge” of text data from social networks, 2) plan campaigns and make decisions to better promote brands and protect online reputation, 3) optimise campaigns to ensure a high return on investment and 4) create benchmarks that will help achieve future success in social media marketing. Below is a table listing the key assets of the system which currently do not exist in the open market and how they respond to real user needs.


Economic Impact on Partner SMEs

With regard to the SMEs in the Optimizr consortium, the successful commercialisation of the product will bring them significant economic benefits, in addition to the savings they will enjoy from the sharp increase in efficiency that Optimizr will provide. As reported by consortium members WAVE, ICON and OVER, time-consuming manual data mapping, collection and analysis of social media campaigns cost SME marketing agencies and other businesses thousands of hours in employee time. SMEs using Optimizr will benefit economically by reducing the effort currently spent on marketing campaigns, particularly on strategy planning, analysis and, above all, optimisation. This economic impact was estimated in a study that the partners carried out among end users inside (and also outside) the consortium before preparing the proposal. It was found that, on average, consortium SMEs WAVE, ICON and OVER (and their colleagues) spend 30% of their time planning social media campaigns and 35% of their time manually measuring, analysing and optimising them. We have estimated that the use of Optimizr will reduce this time significantly (our conservative estimate is that campaign planning time will fall from 30% to 10% and time spent analysing and optimising marketing campaigns from 35% to 10%). Considering that almost 90% of European marketing SMEs have 1-5 people dedicated to these tasks, the potential impact of the Optimizr system would be, on average, in the range of €5,000-40,000 per year per employee. Based on statistics on the number of employees dedicated to these tasks, the average economic impact per company has been calculated as €150,000/year.
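The time shares quoted above imply that 45 percentage points of working time would be freed per employee (65% before, 20% after). The report does not give the formula behind its euro figures, so the sketch below is only one plausible way to turn the time shares into a yearly saving; the employee cost and head count are hypothetical inputs, not data from the study.

```python
# Time shares from the text, in percent of working time.
BEFORE = {"planning": 30, "analysis_and_optimisation": 35}
AFTER = {"planning": 10, "analysis_and_optimisation": 10}


def freed_time_points(before, after):
    """Percentage points of working time freed by the tool."""
    return sum(before.values()) - sum(after.values())


def annual_saving(employee_cost_eur, employees):
    """Yearly saving if the freed time is valued at the employee's full cost."""
    return employee_cost_eur * employees * freed_time_points(BEFORE, AFTER) / 100


points = freed_time_points(BEFORE, AFTER)
# Hypothetical agency: 3 employees on these tasks, each costing 40,000 EUR per year.
saving = annual_saving(employee_cost_eur=40_000, employees=3)
```

Under these assumed inputs the per-employee saving is 18,000 EUR, which lies inside the range quoted in the text; other cost assumptions would give other values.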

In addition to the increased competitiveness gained through optimization and efficiency, the Optimizr project results will give the SMEs a tool to exploit as a new business line, as well as a number of methods and technologies developed under the project that can be used to power their own products and services.

The Impact on Society and Policy
The social and potential policy impact of social media tools such as Social Networks is already being studied by the Commission, as evidenced by its Article 29 Working Party On Online Social Networking and recent agreements with major social networking companies to improve the safety of minors, though other policy moves have yet to be made in the area of social media. During a speech at the Safer Internet Forum, former EC Information Society and Media Commissioner Viviane Reding argued the case for the use of social media tools for business, stating her belief that the use of social networks in the enterprise has the “potential to increase sales, improve customer engagement and increase worker productivity”. To that end, the Optimizr project will positively impact society by allowing European companies, particularly SMEs, to increase their competitiveness through the use of social media applications to promote their business and communicate with current and potential customers, helping strengthen European businesses as a whole. The Optimizr project will also support the goals of the Lisbon Strategy, which recognised that information and communications technologies (ICT) have a vital role to play in achieving them, and will help support the goals of the EC’s Digital Agenda for Europe, which encourages digital entrepreneurship for competitiveness.

List of Websites:
http://optimizr.eu/