

Noise-To-Opportunity Conversion

This is part of the Social Media Analysis seminar at Hasso-Plattner-Institute, Potsdam, Germany.

by Daniel Kurzynski, Dimitri Korsch, Stefan Bunk

###Description

This tool is a prototype that demonstrates a new approach for companies to find potential customers in social networks. By listening to the noise of social network posts, we identify users who express a demand for a certain product. We achieve this identification with a two-stage text categorization classifier: First, we detect whether the post expresses a demand for some product in general. Second, we detect which product the post is about. By using the company's brochures, we minimize the integration effort of our system.
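The two-stage decision described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the class `TwoStageSketch`, the `Scored` pair, the demand threshold, and the rule of picking the single most probable product are all assumptions made for the example, and the two models are toy keyword rules rather than trained classifiers.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.ToDoubleFunction;

public class TwoStageSketch {

    /** Hypothetical (product, probability) pair; not the library's ProductClassification. */
    static final class Scored {
        final String product;
        final double prob;

        Scored(String product, double prob) {
            this.product = product;
            this.prob = prob;
        }
    }

    /**
     * Stage 1: return empty unless the demand probability reaches the threshold.
     * Stage 2: otherwise return the product with the highest probability.
     */
    static Optional<String> classify(String post,
                                     ToDoubleFunction<String> demandModel,
                                     Function<String, List<Scored>> productModel,
                                     double demandThreshold) {
        if (demandModel.applyAsDouble(post) < demandThreshold) {
            return Optional.empty();
        }
        return productModel.apply(post).stream()
                .max(Comparator.comparingDouble((Scored s) -> s.prob))
                .map(s -> s.product);
    }

    public static void main(String[] args) {
        // Toy models for illustration only: keyword rules instead of trained classifiers.
        ToDoubleFunction<String> demand = p -> p.contains("recommend") ? 0.9 : 0.1;
        Function<String, List<Scored>> product = p -> Arrays.asList(
                new Scored("CRM", p.contains("customers") ? 0.8 : 0.2),
                new Scored("ECOM", p.contains("shop") ? 0.8 : 0.2));

        System.out.println(classify("Can you recommend a tool to manage my customers?", demand, product, 0.5));
        System.out.println(classify("Nice weather today.", demand, product, 0.5));
    }
}
```

Running the product stage only for posts that pass the demand stage keeps posts without any purchase intent from being assigned to a product.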

###Folders

The folder NTOClassification contains the project that should be used to analyze posts. The folder NTOTagger contains a web app that can be used to create a gold standard for the evaluation of the NTOClassifier or to generate training examples for the demand classifier.
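As an illustration of how a gold standard could be used for evaluation, the sketch below computes precision and recall of thresholded demand probabilities against gold labels. It is an assumption-laden example: the class `DemandEvaluationSketch`, the threshold, and the sample values are hypothetical, and the project may evaluate its classifiers differently.

```java
import java.util.Arrays;
import java.util.List;

public class DemandEvaluationSketch {

    /**
     * Returns {precision, recall} for demand predictions.
     * A post counts as "demand" when its probability is at least the threshold.
     */
    static double[] precisionRecall(List<Double> probs, List<Boolean> gold, double threshold) {
        if (probs.size() != gold.size()) {
            throw new IllegalArgumentException("probs and gold must have the same size");
        }
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < probs.size(); i++) {
            boolean predicted = probs.get(i) >= threshold;
            boolean actual = gold.get(i);
            if (predicted && actual) {
                tp++;
            } else if (predicted) {
                fp++;
            } else if (actual) {
                fn++;
            }
        }
        // Define the metric as 0 when its denominator is 0.
        double precision = tp + fp == 0 ? 0.0 : (double) tp / (tp + fp);
        double recall = tp + fn == 0 ? 0.0 : (double) tp / (tp + fn);
        return new double[] {precision, recall};
    }

    public static void main(String[] args) {
        // Hypothetical predicted probabilities and gold labels for four posts.
        List<Double> probs = Arrays.asList(0.9, 0.8, 0.3, 0.6);
        List<Boolean> gold = Arrays.asList(true, false, true, true);

        double[] pr = precisionRecall(probs, gold, 0.5);
        System.out.println("precision=" + pr[0] + ", recall=" + pr[1]);
    }
}
```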

###Usage

This is a Maven project. We recommend installing it into your local repository (for example, with mvn install) and then adding it as a dependency:

<dependency>
	<groupId>com.blog_intelligence</groupId>
	<artifactId>nto</artifactId>
	<version>1.0</version>
</dependency>

####Classes

There are three important classes:

  • Document: Represents the object to learn from and to classify: a post or a brochure.
  • NTOClassifier: Predicts demand (predictDemand) and product (predictProduct) for each document.
  • DocumentExtractor: Reads documents from file or database.

####Example Code

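// Standard-library imports used by this snippet (the project's own classes must be imported as well).
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/**
 * Reading training data
 */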
DocumentExtractor documentExtractor = new DocumentExtractor(
		new File("stopwords.txt"),
		new File("german-fast.tagger")
);

// Adapt files here if necessary.
ReadingResult csvDocs = documentExtractor.readFromCSV(
		new File("linked_in_posts.csv"),
		new File("brochures.csv"),
		new File("classification.json")
);

// Load documents from a database (CONFIG stands for your database configuration).
// The result can be used in the same way as csvDocs, or combined with it:
ReadingResult dbDocs = documentExtractor.readFromDB(CONFIG);
List<Document> combined = new ArrayList<>();
combined.addAll(csvDocs.demandDocuments());
combined.addAll(dbDocs.demandDocuments());

/**
 * Building classifier
 */
NTOClassifier classifier = new NTOClassifier(
		new File("stopwords.txt"),
		new File("german-fast.tagger")
);

// Training
classifier.trainDemand(csvDocs.demandDocuments());
classifier.trainProduct(csvDocs.productDocuments());

// Prediction
String post = "Hi! I am the CTO of Startup Inc. Lately, I have problems organising my customers. " +
				"Do you have any recommendations for a good crm system to handle them?";

double probDemand = classifier.predictDemand(post);
System.out.println("Demand probability " + probDemand);

List<ProductClassification> probsProduct = classifier.predictProduct(post);
for (ProductClassification classification : probsProduct) {
	System.out.println(classification.product() + ": " + classification.prob());
}

The complete example can be found in the example folder.

####Persistence

You can persist the classifier models by calling persistDemand and persistProducts on the NTOClassifier, and restore them in a later run with loadDemand and loadProduct.

// Persisting for next run
classifier.persistDemand(DEMAND_MODEL_FILE);
classifier.persistProducts(PRODUCT_MODEL_FILE);

// Load persisted models
classifier.loadDemand(DEMAND_MODEL_FILE);
classifier.loadProduct(PRODUCT_MODEL_FILE);
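A common way to use persistence is to load a model when its file already exists and to train and persist it otherwise. The sketch below shows that pattern under stated assumptions: the `Stage` interface and the `loadOrTrain` helper are hypothetical and not part of the library, and the dummy stage only writes a marker file instead of a real model.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class ModelCacheSketch {

    /** Hypothetical abstraction over one classifier stage; not part of the library. */
    interface Stage {
        void train();
        void persist(File modelFile) throws IOException;
        void load(File modelFile) throws IOException;
    }

    /**
     * Loads the model if the file exists; otherwise trains and persists it.
     * Returns true if the model was loaded from disk.
     */
    static boolean loadOrTrain(Stage stage, File modelFile) throws IOException {
        if (modelFile.isFile()) {
            stage.load(modelFile);
            return true;
        }
        stage.train();
        stage.persist(modelFile);
        return false;
    }

    public static void main(String[] args) throws IOException {
        File modelFile = new File(Files.createTempDirectory("nto-sketch").toFile(), "demand.model");

        // Dummy stage for illustration: "persisting" just writes a one-byte marker file.
        Stage dummy = new Stage() {
            public void train() { }
            public void persist(File f) throws IOException { Files.write(f.toPath(), new byte[] {1}); }
            public void load(File f) throws IOException { Files.readAllBytes(f.toPath()); }
        };

        System.out.println("loaded from disk: " + loadOrTrain(dummy, modelFile)); // first run trains
        System.out.println("loaded from disk: " + loadOrTrain(dummy, modelFile)); // second run loads
    }
}
```

With the NTOClassifier, the demand stage would presumably map train, persist, and load to trainDemand, persistDemand, and loadDemand, and the product stage to trainProduct, persistProducts, and loadProduct.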
