
loklak / loklak_server


Distributed Open Source Twitter and social media message search server that anonymously collects, shares, dumps and indexes data. http://api.loklak.org

License: GNU Lesser General Public License v2.1

Shell 0.89% HTML 1.49% CSS 0.45% JavaScript 1.55% Java 94.52% Batchfile 0.01% Ruby 0.02% Python 0.45% Perl 0.25% Scala 0.19% Dockerfile 0.18%

loklak_server's People

Contributors

aneeshd16, ansgarschmidt, daminisatya, dengyiping, djmgit, fatimarafiqui, hpdang, imujjwal96, jigyasa-grover, kapillamba4, kavithaenair, leonmak, marcnause, mariobehling, niccokunzmann, orbiter, pythad, rmader, seadog007, sevazhidkov, shiftsayan, shivenmian, simsausaurabh, singhpratyush, smokingwheels, sudheesh001, vibhcool, yasoob, yukiisbored, zyzo


loklak_server's Issues

Implement a profanity filter for the tweets

Since loklak_webclient is going to be used mainly at conferences on a public wall, it would be best to avoid outliers in the tweets (i.e. the ones which are unrelated or contain profanity). A possible option is to have a flag in the query:
/api/search.json?q=<string>&profanityFilter=false
Filtering such content should be on by default unless explicitly disabled in the query as shown above.

I think this enhancement will help event organizers avoid embarrassing situations and trolls who want to sabotage the Twitter wall at a conference.
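
A minimal sketch of how such a flag could be handled on the server side; the profanityFilter handling and the word list below are illustrative assumptions, not part of the current code.

import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper: drops tweets containing blacklisted words unless the
// client passed profanityFilter=false with the search query.
public class ProfanityFilter {
    // placeholder word list; a real deployment would load this from a config file
    private static final Set<String> BLACKLIST = new HashSet<>(Arrays.asList("badword1", "badword2"));

    public static List<String> filter(List<String> tweets, boolean filterEnabled) {
        if (!filterEnabled) return tweets;
        return tweets.stream()
                .filter(text -> Arrays.stream(text.toLowerCase().split("\\W+"))
                        .noneMatch(BLACKLIST::contains))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> tweets = Arrays.asList("great talk at #fossasia", "badword1 ruins the wall");
        // the filter is on by default; profanityFilter=false in the query would pass false here
        System.out.println(filter(tweets, true)); // [great talk at #fossasia]
    }
}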

Any other views, @loklak/owners?

Server account says submitted data is not well formed for perfectly well formed JSON

HTTP ERROR: 400

Problem accessing /api/account.json. Reason:
submitted data is not well-formed: account data must either contain authentication details or an apps setting

(screenshot attached)

I have the latest build at af0b7ac.
I ran ant and then bin/start.sh.

There is a temporary error handler here, but the session is maintained:
https://github.com/loklak/loklak_webclient/blob/0678b3f5196470f9b366c14141cf909bf6850bca/server/index.js#L67

java.net.UnknownHostException: twitter.com

Caused by overloading (approx. 2 second search interval) and using up the memory/swap file.
Fix: reboot the machine, start loklak again and overwrite the log file.
This is on a Debian 7 64-bit VM, a forwarded peer.

2015-04-17 02:43:24.718:INFO::qtp625817278-29: /api/search.json?q=instagram&timezoneOffset=-480&maximumRecords=100&source=twitter&minified=true
java.net.UnknownHostException: twitter.com
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:625)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:933)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
at org.loklak.api.ClientHelper.getConnection(ClientHelper.java:51)
at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:67)
at org.loklak.DAO.scrapeTwitter(DAO.java:625)
at org.loklak.api.server.SearchServlet.doGet(SearchServlet.java:107)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:800)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:309)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:313)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
at java.lang.Thread.run(Thread.java:745)

Crawler does not run

I tried curl "http://127.0.0.1:9000/api/crawler.json?start=loklak&depth=3" but the crawler stops after one hashtag.

Exception in thread "Thread-11" java.lang.NullPointerException
at org.loklak.data.QueryEntry.toJSON(QueryEntry.java:226)
at org.loklak.data.AbstractIndexEntry.toString(AbstractIndexEntry.java:45)
at org.loklak.data.AbstractIndexEntry.toJSON(AbstractIndexEntry.java:55)
at org.loklak.data.AbstractIndexFactory.writeEntry(AbstractIndexFactory.java:100)
at org.loklak.data.DAO.scrapeTwitter(DAO.java:733)
at org.loklak.Crawler.process(Crawler.java:65)
at org.loklak.Caretaker.run(Caretaker.java:97)

Custom source_type values

When creating a MessageEntry with a value not listed in the SourceType enum, the value of source_type is set to USER.
What if I want to add other source types? Is it mandatory that they be added to the SourceType enum? Another way would be to allow custom values to be added and searched.
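
A minimal sketch of the two options; the extra enum values and the wrapper class are illustrative assumptions, not the current SourceType definition.

// Option 1: keep the closed enum and fall back to USER for unknown values (current behaviour).
// Option 2: keep the raw string next to the enum so custom values stay indexable and searchable.
public class SourceTypeDemo {
    enum SourceType { TWITTER, RSS, GEOJSON, USER }

    static SourceType parseStrict(String value) {
        try {
            return SourceType.valueOf(value.toUpperCase());
        } catch (IllegalArgumentException e) {
            return SourceType.USER; // unknown values collapse to USER
        }
    }

    static class Source {
        final SourceType type;
        final String raw;
        Source(SourceType type, String raw) { this.type = type; this.raw = raw; }
        public String toString() { return type + " (raw: " + raw + ")"; }
    }

    static Source parseOpen(String value) {
        return new Source(parseStrict(value), value);
    }

    public static void main(String[] args) {
        System.out.println(parseStrict("my_custom_source")); // USER
        System.out.println(parseOpen("my_custom_source"));   // USER (raw: my_custom_source)
    }
}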

ping frequently

Every peer shall ping the back-end frequently (e.g. once every 5 minutes) to make the peer's availability visible even after a restart of the peer. We need this since we don't store a peer map or list anywhere.
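
A minimal sketch of such a periodic ping; the endpoint path and back-end URL are assumptions, only the 5-minute interval is taken from the text above.

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: the peer pings the back-end every 5 minutes so the back-end
// learns about it again after a restart, without a stored peer map or list.
public class PeerPing {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                // the path /api/hello.json is an assumption for illustration
                HttpURLConnection c = (HttpURLConnection) new URL("http://loklak.org/api/hello.json").openConnection();
                c.setConnectTimeout(10000);
                c.getResponseCode(); // a simple GET is enough to refresh the peer's lastSeen timestamp
                c.disconnect();
            } catch (Exception e) {
                // back-end unreachable; try again at the next interval
            }
        }, 0, 5, TimeUnit.MINUTES);
    }
}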

enrich tweet with geolocation

geolocation coordinates shall be attached to tweets. The source of these coordinates can be:

  • import from geoJson or similar resources
  • auto-detection from tweet locations using a location dictionary
  • auto-detection from tweet text using a location dictionary

The source of the locations will be set in the tweet JSON to make it possible to distinguish exact locations from imports or self-assigned coordinates from dictionary lookups.
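
A minimal sketch of how the coordinates and their provenance could be attached to the tweet JSON; the field names location_point and location_source and the use of org.json are assumptions for illustration.

import org.json.JSONArray;
import org.json.JSONObject;

// Hypothetical sketch: attach coordinates plus a marker saying where they came from,
// so exact/imported coordinates can be distinguished from dictionary lookups.
public class GeoEnricher {
    enum LocationSource { REPORT, IMPORT, PLACE_DICTIONARY, TEXT_DICTIONARY }

    static void enrich(JSONObject tweet, double lon, double lat, LocationSource source) {
        tweet.put("location_point", new JSONArray().put(lon).put(lat));
        tweet.put("location_source", source.name());
    }

    public static void main(String[] args) {
        JSONObject tweet = new JSONObject().put("text", "hello from Frankfurt");
        enrich(tweet, 8.68, 50.11, LocationSource.TEXT_DICTIONARY);
        System.out.println(tweet); // now contains location_point and location_source fields
    }
}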

Failed to execute phase [query], all shards failed

After startup this does not happen again.
Not sure if it is a hardware fault, but the peer was restarted while searches were being asked of it during startup.

2015-04-17 03:41:52.533:WARN::qtp403370592-42: Failed to execute phase [query], all shards failed; shardFailures {[Rm52FeIGSlSSmooer2pPYw][messages][0]: SearchParseException[[messages][0]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][0]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][1]: SearchParseException[[messages][1]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][1]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][2]: SearchParseException[[messages][2]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][2]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][3]: SearchParseException[[messages][3]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][3]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][4]: SearchParseException[[messages][4]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][4]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][5]: SearchParseException[[messages][5]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][5]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][6]: SearchParseException[[messages][6]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][6]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][7]: SearchParseException[[messages][7]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source 
[{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][7]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }
org.elasticsearch.action.search.SearchPhaseExecutionException: Failed to execute phase [query], all shards failed; shardFailures {[Rm52FeIGSlSSmooer2pPYw][messages][0]: SearchParseException[[messages][0]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][0]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][1]: SearchParseException[[messages][1]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][1]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][2]: SearchParseException[[messages][2]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][2]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][3]: SearchParseException[[messages][3]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][3]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][4]: SearchParseException[[messages][4]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][4]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][5]: SearchParseException[[messages][5]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][5]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][6]: SearchParseException[[messages][6]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][6]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][7]: SearchParseException[[messages][7]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source 
[{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][7]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:238)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$1.onFailure(TransportSearchTypeAction.java:184)
at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:565)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

org.eclipse.jetty.io.EofException caused by java.io.IOException: Broken pipe

The server is otherwise working fine.
I have had this error from time to time since the end of March.

2015-04-17 22:15:02.421:WARN::qtp443484261-44:
org.eclipse.jetty.io.EofException
at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:192)
at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:408)
at org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:364)
at org.eclipse.jetty.io.SelectChannelEndPoint.onSelected(SelectChannelEndPoint.java:111)
at org.eclipse.jetty.io.SelectorManager$ManagedSelector.processKey(SelectorManager.java:636)
at org.eclipse.jetty.io.SelectorManager$ManagedSelector.select(SelectorManager.java:607)
at org.eclipse.jetty.io.SelectorManager$ManagedSelector.run(SelectorManager.java:545)
at org.eclipse.jetty.util.thread.NonBlockingThread.run(NonBlockingThread.java:52)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
at java.lang.Thread.run(Thread.java:745)
Caused by:
java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:492)
at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:170)
at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:408)
at org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:364)
at org.eclipse.jetty.io.SelectChannelEndPoint.onSelected(SelectChannelEndPoint.java:111)
at org.eclipse.jetty.io.SelectorManager$ManagedSelector.processKey(SelectorManager.java:636)
at org.eclipse.jetty.io.SelectorManager$ManagedSelector.select(SelectorManager.java:607)
at org.eclipse.jetty.io.SelectorManager$ManagedSelector.run(SelectorManager.java:545)
at org.eclipse.jetty.util.thread.NonBlockingThread.run(NonBlockingThread.java:52)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
at java.lang.Thread.run(Thread.java:745)

Apostrophe shows up as 9 in JSON.

One of my tweets contains the word "it's", but in the JSON I retrieved from loklak it shows up as 9. See the screenshots below for illustration.

(screenshots attached)

create a canonical_id field for cross-posting

When a rich-content tweet is generated, that tweet is cross-posted to Twitter as well. That means for each rich tweet, two tweets are generated within loklak: one which we assign ourselves with rich content, and another which is (possibly!) retrieved from Twitter. In such cases we must identify the rich tweet and the cross-posted tweet from Twitter. Our own tweet will get a canonical_id field assigned, which contains the id of the cross-posted tweet from Twitter. To sort out duplicate tweets in search results, we just have to follow the canonical_id and kick out all tweets with the id from that field.

To-do: add a canonical_id field. This field must be filled with a new API which accepts rich tweets from the webclient.
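
A minimal sketch of that de-duplication step on search results; the id_str field name and the use of plain maps are assumptions for illustration.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: drop the cross-posted Twitter copy when our own rich tweet
// already points at it via canonical_id.
public class CanonicalDedup {
    static List<Map<String, Object>> dedup(List<Map<String, Object>> hits) {
        Set<Object> canonicalIds = new HashSet<>();
        for (Map<String, Object> hit : hits) {
            Object canonical = hit.get("canonical_id");
            if (canonical != null) canonicalIds.add(canonical);
        }
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> hit : hits) {
            // kick out every tweet whose id is referenced by another tweet's canonical_id
            if (!canonicalIds.contains(hit.get("id_str"))) result.add(hit);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> rich = Map.of("id_str", "A1", "canonical_id", "T9");
        Map<String, Object> copy = Map.of("id_str", "T9");
        System.out.println(dedup(List.of(rich, copy)).size()); // 1 -> only the rich tweet survives
    }
}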

create a proxy servlet to support loading of user images

In #30 we describe the non-availability of user images due to blocking by Twitter. We want to cache user images to fill in the missing images as well as possible. This supports the feature described in fossasia/loklak_webclient#59.

The following process shall be implemented:

  • create an API servlet which provides a proxy functionality
  • the proxy shall especially provide the user images
  • to identify user images and assign them to users, requests to the API must contain the screen_name of the user.
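
A minimal sketch of such a proxy endpoint; the servlet path, the image lookup and the omitted caching/error handling are assumptions for illustration.

import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical sketch: /api/proxy.json?screen_name=... streams the profile image for that
// user; a real implementation would serve from a local cache and fetch only on a miss.
public class ProxyServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response) throws java.io.IOException {
        String screenName = request.getParameter("screen_name");
        if (screenName == null) {
            response.sendError(400, "screen_name parameter missing");
            return;
        }
        String imageUrl = lookupProfileImageUrl(screenName); // would come from the user index
        response.setContentType("image/png");
        try (InputStream in = new URL(imageUrl).openStream();
             OutputStream out = response.getOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) out.write(buffer, 0, n);
        }
    }

    private String lookupProfileImageUrl(String screenName) {
        return "https://example.org/images/" + screenName + ".png"; // placeholder lookup
    }
}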

Create an openstreetmap tile-pattern with location tag painted in the center

A new API path 'vis' shall be created to host visualization servlets. One of these servlets shall provide OSM maps with information painted on them. The information shall be:

  • marker in the center
  • OSM license at the bottom
  • optional: date and headline at the top

The libraries added with 875b683, 448d34e and 40a7bc8 shall be used. The servlets shall take the following attributes:

  • left, top, right, bottom coordinates
  • text which will be added as message
  • a flag which allows printing the date of the message as well

Twitter Authorization: Backend Requirements

We already discussed the option that a Twitter user shall be able to authenticate within loklak with his/her Twitter account, which would turn loklak into a Twitter app for that user.

What we need is

  • a web page which takes the log-in data from Twitter
  • a servlet which authenticates the user against Twitter and establishes a user authentication against the loklak back-end
  • a framework for loklak web pages which uses that authentication; how is this done in angular.js?

We must discuss what you would expect from the server backend for authentication. Can you describe details?

Provide a settings servlet for loklak_webclient

The main purpose of such a servlet is that application keys must be shared. We want to host such keys in loklak_server so that loklak_webclient can use the same keys too.

It shall work like this:

  • the loklak_webclient reads settings from its custom_configFile.json
  • it then reads a JSON from /api/settings.json. The JSON contains properties in the same format as custom_configFile.json. Every entity that is inside those properties from the server overwrites the settings as read from custom_configFile.json

Log file of 30 Gb "java.lang.IllegalArgumentException: Illegal group reference"

It does not seem to affect results, but the system gradually slows down.
Running loklak and YaCy on 2 GB of RAM under Debian 7.8.
I wrote a QB64 program to dump records out of the raw data, which is about 500,000 lines.
The file size was 31,718,676,586 bytes; it is now deleted on the loklak server, but I have a copy.

"java.lang.IllegalArgumentException: Illegal group reference"
"at java.util.regex.Matcher.appendReplacement(Matcher.java:808)"
"at java.util.regex.Matcher.replaceFirst(Matcher.java:955)"
"at org.loklak.scraper.TwitterScraper$TwitterTweet.(TwitterScraper.java:282)"
"at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:165)"
"at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:70)"
"at org.loklak.DAO.scrapeTwitter(DAO.java:519)"
"at org.loklak.api.server.SearchServlet$1.run(SearchServlet.java:89)"
"java.lang.IllegalArgumentException: Illegal group reference"
"at java.util.regex.Matcher.appendReplacement(Matcher.java:808)"
"at java.util.regex.Matcher.replaceFirst(Matcher.java:955)"
"at org.loklak.scraper.TwitterScraper$TwitterTweet.(TwitterScraper.java:282)"
"at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:165)"
"at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:70)"
"at org.loklak.DAO.scrapeTwitter(DAO.java:519)"
"at org.loklak.api.server.SearchServlet$1.run(SearchServlet.java:89)"
"java.lang.IllegalArgumentException: Illegal group reference"
"at java.util.regex.Matcher.appendReplacement(Matcher.java:808)"
"at java.util.regex.Matcher.replaceFirst(Matcher.java:955)"
"at org.loklak.scraper.TwitterScraper$TwitterTweet.(TwitterScraper.java:282)"
"at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:165)"
"at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:70)"
"at org.loklak.DAO.scrapeTwitter(DAO.java:519)"
"at org.loklak.api.server.SearchServlet$1.run(SearchServlet.java:89)"
"java.lang.IllegalArgumentException: Illegal group reference"
"at java.util.regex.Matcher.appendReplacement(Matcher.java:808)"
"at java.util.regex.Matcher.replaceFirst(Matcher.java:955)"
"at org.loklak.scraper.TwitterScraper$TwitterTweet.(TwitterScraper.java:282)"
"at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:165)"
"at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:70)"
"at org.loklak.DAO.scrapeTwitter(DAO.java:519)"
"at org.loklak.api.server.SearchServlet$1.run(SearchServlet.java:89)"

Add a search constraint which can restrict to location boundaries

A bounding box shall be possible as a search constraint for messages with locations.
An example of such a constraint would be /location=8.58,50.178,8.59,50.181
which means the syntax shall be /location=lon-west,lat-south,lon-east,lat-north.
This syntax is very close to the one used in OpenStreetMap to extract bounding-box maps as PNG.
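
A minimal sketch of parsing such a constraint and testing a message against it; only the query syntax above is taken from the proposal, the class itself is illustrative.

// Hypothetical sketch: parse "location=lon-west,lat-south,lon-east,lat-north"
// and test a message's coordinates against the bounding box.
public class LocationConstraint {
    final double lonWest, latSouth, lonEast, latNorth;

    LocationConstraint(String constraint) {
        // constraint looks like "location=8.58,50.178,8.59,50.181"
        String[] p = constraint.substring(constraint.indexOf('=') + 1).split(",");
        this.lonWest = Double.parseDouble(p[0]);
        this.latSouth = Double.parseDouble(p[1]);
        this.lonEast = Double.parseDouble(p[2]);
        this.latNorth = Double.parseDouble(p[3]);
    }

    boolean contains(double lon, double lat) {
        return lon >= lonWest && lon <= lonEast && lat >= latSouth && lat <= latNorth;
    }

    public static void main(String[] args) {
        LocationConstraint box = new LocationConstraint("location=8.58,50.178,8.59,50.181");
        System.out.println(box.contains(8.585, 50.180)); // true
        System.out.println(box.contains(8.700, 50.180)); // false
    }
}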

New Push API with support for other data formats

This Push API is different from the current push.json in several points:

  • it supports formats other than the search-result format (GeoJSON, ..)
  • In POST parameters, users provide a link to their data file, not the data itself
  • For the GeoJSON type, a maptype parameter must be specified to map attributes from the GeoJSON property fields. The maptype consists of simple mapping rules of the form geojson-field:loklak-tweet-field, separated by commas (see the sketch below).
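
A minimal sketch of parsing such a maptype parameter into field mappings; the parameter format is as described above, the parser itself is illustrative.

import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: "name:screen_name,description:text" becomes a map
// from GeoJSON property fields to loklak tweet fields.
public class MaptypeParser {
    static Map<String, String> parse(String maptype) {
        Map<String, String> rules = new LinkedHashMap<>();
        for (String rule : maptype.split(",")) {
            String[] kv = rule.split(":", 2);
            if (kv.length == 2) rules.put(kv[0].trim(), kv[1].trim());
        }
        return rules;
    }

    public static void main(String[] args) {
        System.out.println(parse("name:screen_name,description:text"));
        // {name=screen_name, description=text}
    }
}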

Related:
fossasia/loklak_webclient#205
fossasia/loklak_webclient#206
https://github.com/fossasia/api.fossasia.net/issues/45

provide push API for static information

To add more sources than those harvested from Twitter, we want to add data from other sources including RSS feeds and GeoJSON data. These sources must be added to the message index in the context of a lifetime flag (#33).

The data submitted to the API must therefore include:

  • URL of the source
  • data format of the source (i.e. RSS/GeoRSS/geoJSON etc)
  • a harvesting frequency (the submitter knows best how often the data changes)
  • a lifetime. The lifetime must be less than or equal to the harvesting frequency. The lifetime is applied to the index and may mean that the data disappears from search results after that time. A special lifetime of 2^31-1 can be set to announce that the data is static forever, like a normal 'news' message, or a location that will never change (e.g. the place of a city)
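
A minimal sketch of a submission object enforcing the lifetime rule from the last item above; the field names and the class itself are illustrative assumptions.

// Hypothetical sketch of a submission record and the lifetime/frequency rule.
public class HarvestingSubmission {
    static final long STATIC_LIFETIME = Integer.MAX_VALUE; // 2^31-1 marks "static forever"

    final String sourceUrl;
    final String dataFormat;        // e.g. RSS, GeoRSS, geoJSON
    final long harvestingFrequency; // seconds between harvests
    final long lifetime;            // seconds the data stays in search results

    HarvestingSubmission(String sourceUrl, String dataFormat, long harvestingFrequency, long lifetime) {
        if (lifetime != STATIC_LIFETIME && lifetime > harvestingFrequency)
            throw new IllegalArgumentException("lifetime must be less than or equal to the harvesting frequency");
        this.sourceUrl = sourceUrl;
        this.dataFormat = dataFormat;
        this.harvestingFrequency = harvestingFrequency;
        this.lifetime = lifetime;
    }

    public static void main(String[] args) {
        new HarvestingSubmission("http://example.org/feed.rss", "RSS", 3600, 3600); // accepted
    }
}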

Create an update.sh to read the config so that there is no down time after update.

I was updating my server and by accident I forgot to stop it.
So I updated the other 5 with the servers running.
I still did a start and stop at the end to bring my new config file online.
It would need further testing because 1 of the 6 failed, but that is a really old PC.

I was able to clone, copy and build without error.
I used these commands:

cd /var/loklak_server                                    # the existing install
git clone https://github.com/loklak/loklak_server.git   # fetch the latest sources into a subdirectory
cp -r loklak_server/* /var/loklak_server                 # copy them over the running install
ant                                                      # rebuild
bin/stop.sh                                              # stop and start again to bring the new build online
bin/start.sh

Various Errors under constant load

(graph attached)

I wrote a program to time how long a server takes to fill a request for an RSS feed, because I was deleting the index folder on all my peers three times a day to get rid of the errors listed below.
My servers have now been running for 10 hours with the odd 503 error here or there, but without filling the log file at a fantastic rate with the things listed below and making the VM run out of space.
I will experiment with the DoS settings over the next few days to see what happens.
The picture is a graph of loklak.org and how long it takes to complete an RSS feed request.
I ask every 10 seconds.

Some of the errors/WARNs I have had over the past week, with fixes:

DNS? Fix: destroy the virtual machine and get a new IP.
Update: delete the index folder.
java.net.UnknownHostException: twitter.com
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:625)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:933)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
at org.loklak.api.ClientHelper.getConnection(ClientHelper.java:51)
at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:67)
at org.loklak.DAO.scrapeTwitter(DAO.java:625)
at org.loklak.api.server.SearchServlet.doGet(SearchServlet.java:107)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:800)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:309)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:313)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
at java.lang.Thread.run(Thread.java:745)

Delete index folder fixes it.

2015-04-20 00:32:42.263:WARN::qtp1246952023-40: [messages][1] null
org.elasticsearch.action.NoShardAvailableActionException: [messages][1] null
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.perform(TransportShardSingleOperationAction.java:175)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.start(TransportShardSingleOperationAction.java:155)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:89)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:55)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:98)
at org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:193)
at org.elasticsearch.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:201)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:91)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:65)
at org.loklak.DAO.getTweetMap(DAO.java:347)
at org.loklak.DAO.record(DAO.java:402)
at org.loklak.DAO.scrapeTwitter(DAO.java:634)
at org.loklak.api.server.SearchServlet.doGet(SearchServlet.java:107)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:800)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:309)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:313)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
at java.lang.Thread.run(Thread.java:745)
2015-04-20 00:32:42.650:INFO::qtp1246952023-40: /

Delete index folder fixes it.

java.lang.IllegalArgumentException: Illegal group reference
at java.util.regex.Matcher.appendReplacement(Matcher.java:808)
at java.util.regex.Matcher.replaceFirst(Matcher.java:955)
at org.loklak.scraper.TwitterScraper$TwitterTweet.<init>(TwitterScraper.java:282)
at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:165)
at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:70)
at org.loklak.DAO.scrapeTwitter(DAO.java:625)
at org.loklak.api.server.SearchServlet.doGet(SearchServlet.java:107)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:800)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:309)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:313)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
at java.lang.Thread.run(Thread.java:745)

Delete index folder fixes it.

java.lang.IndexOutOfBoundsException: No group 8
at java.util.regex.Matcher.start(Matcher.java:374)
at java.util.regex.Matcher.appendReplacement(Matcher.java:831)
at java.util.regex.Matcher.replaceFirst(Matcher.java:955)
at org.loklak.scraper.TwitterScraper$TwitterTweet.<init>(TwitterScraper.java:282)
at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:165)
at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:70)
at org.loklak.DAO.scrapeTwitter(DAO.java:625)
at org.loklak.api.server.SearchServlet.doGet(SearchServlet.java:107)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:800)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:309)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:313)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
at java.lang.Thread.run(Thread.java:745)

Connected a remote peer to the same peer, but as another peer,
e.g. the forward ends and backends are equal.
Loklak still works OK.

2015-04-18 11:13:01.683:WARN::qtp16373927-50:
org.eclipse.jetty.io.EofException
at org.eclipse.jetty.server.HttpConnection$SendCallback.reset(HttpConnection.java:610)
at org.eclipse.jetty.server.HttpConnection$SendCallback.access$100(HttpConnection.java:582)
at org.eclipse.jetty.server.HttpConnection.send(HttpConnection.java:464)
at org.eclipse.jetty.server.HttpChannel.sendResponse(HttpChannel.java:766)
at org.eclipse.jetty.server.HttpChannel.write(HttpChannel.java:804)
at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:142)
at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:135)
at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:373)
at java.io.OutputStream.write(OutputStream.java:75)
at org.loklak.api.server.SearchServlet.doGet(SearchServlet.java:203)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:800)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:309)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:313)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
at java.lang.Thread.run(Thread.java:745)

After startup this does not happen again. Deleting the index folder fixes it.
2015-04-17 03:41:52.533:WARN::qtp403370592-42: Failed to execute phase [query], all shards failed; shardFailures {[Rm52FeIGSlSSmooer2pPYw][messages][0]: SearchParseException[[messages][0]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][0]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][1]: SearchParseException[[messages][1]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][1]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][2]: SearchParseException[[messages][2]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][2]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][3]: SearchParseException[[messages][3]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][3]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][4]: SearchParseException[[messages][4]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][4]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][5]: SearchParseException[[messages][5]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][5]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][6]: SearchParseException[[messages][6]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][6]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][7]: SearchParseException[[messages][7]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source 
[{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][7]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }
org.elasticsearch.action.search.SearchPhaseExecutionException: Failed to execute phase [query], all shards failed; shardFailures {[Rm52FeIGSlSSmooer2pPYw][messages][0]: SearchParseException[[messages][0]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][0]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][1]: SearchParseException[[messages][1]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][1]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][2]: SearchParseException[[messages][2]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][2]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][3]: SearchParseException[[messages][3]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][3]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][4]: SearchParseException[[messages][4]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][4]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][5]: SearchParseException[[messages][5]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][5]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][6]: SearchParseException[[messages][6]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source [{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][6]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }{[Rm52FeIGSlSSmooer2pPYw][messages][7]: SearchParseException[[messages][7]: query[+text:twitter],from[0],size[100]: Parse Failure [Failed to parse source 
[{"from":0,"size":100,"query":{"bool":{"must":{"match":{"text":{"query":"twitter","type":"boolean"}}}}},"sort":[{"created_at":{"order":"desc"}}]}]]]; nested: SearchParseException[[messages][7]: query[+text:twitter],from[0],size[100]: Parse Failure [No mapping found for [created_at] in order to sort on]]; }
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:238)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$1.onFailure(TransportSearchTypeAction.java:184)
at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:565)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

Working fine.
2015-04-17 14:36:42.500:INFO::qtp403370592-42: /api/search.json?q=ArianaGrande&timezoneOffset=-480&maximumRecords=100&source=twitter&minified=true
java.lang.IndexOutOfBoundsException: No group 7
at java.util.regex.Matcher.start(Matcher.java:374)
at java.util.regex.Matcher.appendReplacement(Matcher.java:831)
at java.util.regex.Matcher.replaceFirst(Matcher.java:955)
at org.loklak.scraper.TwitterScraper$TwitterTweet.<init>(TwitterScraper.java:282)
at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:165)
at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:70)
at org.loklak.DAO.scrapeTwitter(DAO.java:625)
at org.loklak.api.server.SearchServlet.doGet(SearchServlet.java:107)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:800)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:309)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:313)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
at java.lang.Thread.run(Thread.java:745)

Fork errors compiling with ant.

I forked a copy of loklak and then modified 2 HTML files at about 7 PM (+8).
I downloaded my fork and tried to compile it; there are lots of missing files.
https://github.com/smokingwheels/loklak_server
Just from looking at /var/loklak_server forkerror/src/org/loklak, there are 7 files missing compared to an old working install.
What am I doing wrong?

Here is the list of 20 errors:

Buildfile: /var/loklak_server/build.xml

init:

build:
[delete] Deleting directory /var/loklak_server/classes
[mkdir] Created dir: /var/loklak_server/classes
[echo] loklak: /var/loklak_server/build.xml
[javac] Compiling 43 source files to /var/loklak_server/classes
[javac] /var/loklak_server/src/org/loklak/data/MessageEntry.java:34: error: package org.joda.time.format does not exist
[javac] import org.joda.time.format.ISODateTimeFormat;
[javac] ^
[javac] /var/loklak_server/src/org/loklak/data/UserEntry.java:31: error: package org.joda.time.format does not exist
[javac] import org.joda.time.format.ISODateTimeFormat;
[javac] ^
[javac] /var/loklak_server/src/org/loklak/api/client/SearchClient.java:35: error: package com.fasterxml.jackson.core does not exist
[javac] import com.fasterxml.jackson.core.JsonFactory;
[javac] ^
[javac] /var/loklak_server/src/org/loklak/api/client/SearchClient.java:36: error: package com.fasterxml.jackson.core.type does not exist
[javac] import com.fasterxml.jackson.core.type.TypeReference;
[javac] ^
[javac] /var/loklak_server/src/org/loklak/api/client/SearchClient.java:37: error: package com.fasterxml.jackson.databind does not exist
[javac] import com.fasterxml.jackson.databind.ObjectMapper;
[javac] ^
[javac] /var/loklak_server/src/org/loklak/data/QueryEntry.java:34: error: package org.joda.time.format does not exist
[javac] import org.joda.time.format.ISODateTimeFormat;
[javac] ^
[javac] /var/loklak_server/src/org/loklak/data/MessageEntry.java:84: error: cannot find symbol
[javac] this.created_at = ISODateTimeFormat.dateOptionalTimeParser().parseDateTime(created_at_string).toDate();
[javac] ^
[javac] symbol: variable ISODateTimeFormat
[javac] location: class MessageEntry
[javac] /var/loklak_server/src/org/loklak/data/UserEntry.java:60: error: cannot find symbol
[javac] this.appearance_first = ISODateTimeFormat.dateOptionalTimeParser().parseDateTime(appearance_first_string).toDate();
[javac] ^
[javac] symbol: variable ISODateTimeFormat
[javac] location: class UserEntry
[javac] /var/loklak_server/src/org/loklak/data/UserEntry.java:66: error: cannot find symbol
[javac] this.appearance_latest = ISODateTimeFormat.dateOptionalTimeParser().parseDateTime(appearance_latest_string).toDate();
[javac] ^
[javac] symbol: variable ISODateTimeFormat
[javac] location: class UserEntry
[javac] /var/loklak_server/src/org/loklak/api/client/SearchClient.java:50: error: cannot find symbol
[javac] JsonFactory factory = new JsonFactory();
[javac] ^
[javac] symbol: class JsonFactory
[javac] location: class SearchClient
[javac] /var/loklak_server/src/org/loklak/api/client/SearchClient.java:50: error: cannot find symbol
[javac] JsonFactory factory = new JsonFactory();
[javac] ^
[javac] symbol: class JsonFactory
[javac] location: class SearchClient
[javac] /var/loklak_server/src/org/loklak/api/client/SearchClient.java:51: error: cannot find symbol
[javac] ObjectMapper mapper = new ObjectMapper(factory);
[javac] ^
[javac] symbol: class ObjectMapper
[javac] location: class SearchClient
[javac] /var/loklak_server/src/org/loklak/api/client/SearchClient.java:51: error: cannot find symbol
[javac] ObjectMapper mapper = new ObjectMapper(factory);
[javac] ^
[javac] symbol: class ObjectMapper
[javac] location: class SearchClient
[javac] /var/loklak_server/src/org/loklak/api/client/SearchClient.java:52: error: cannot find symbol
[javac] TypeReference<HashMap<String,Object>> typeRef = new TypeReference<HashMap<String,Object>>() {};
[javac] ^
[javac] symbol: class TypeReference
[javac] location: class SearchClient
[javac] /var/loklak_server/src/org/loklak/api/client/SearchClient.java:52: error: cannot find symbol
[javac] TypeReference<HashMap<String,Object>> typeRef = new TypeReference<HashMap<String,Object>>() {};
[javac] ^
[javac] symbol: class TypeReference
[javac] location: class SearchClient
[javac] /var/loklak_server/src/org/loklak/data/QueryEntry.java:111: error: cannot find symbol
[javac] this.query_first = ISODateTimeFormat.dateOptionalTimeParser().parseDateTime((String) map.get("query_first")).toDate();
[javac] ^
[javac] symbol: variable ISODateTimeFormat
[javac] location: class QueryEntry
[javac] /var/loklak_server/src/org/loklak/data/QueryEntry.java:112: error: cannot find symbol
[javac] this.query_last = ISODateTimeFormat.dateOptionalTimeParser().parseDateTime((String) map.get("query_last")).toDate();
[javac] ^
[javac] symbol: variable ISODateTimeFormat
[javac] location: class QueryEntry
[javac] /var/loklak_server/src/org/loklak/data/QueryEntry.java:113: error: cannot find symbol
[javac] this.retrieval_last = ISODateTimeFormat.dateOptionalTimeParser().parseDateTime((String) map.get("retrieval_last")).toDate();
[javac] ^
[javac] symbol: variable ISODateTimeFormat
[javac] location: class QueryEntry
[javac] /var/loklak_server/src/org/loklak/data/QueryEntry.java:114: error: cannot find symbol
[javac] this.retrieval_next = ISODateTimeFormat.dateOptionalTimeParser().parseDateTime((String) map.get("retrieval_next")).toDate();
[javac] ^
[javac] symbol: variable ISODateTimeFormat
[javac] location: class QueryEntry
[javac] /var/loklak_server/src/org/loklak/data/QueryEntry.java:115: error: cannot find symbol
[javac] this.expected_next = ISODateTimeFormat.dateOptionalTimeParser().parseDateTime((String) map.get("expected_next")).toDate();
[javac] ^
[javac] symbol: variable ISODateTimeFormat
[javac] location: class QueryEntry
[javac] 20 errors

add search index, dump file and visualization for static data harvesting objects

Objects pushed with the API in #35 must be stored. The storage shall be done as an elasticsearch index and also as a dump file for import into new peers.
The index shall contain at least the following fields (a mapping sketch follows the list):

  • date when the harvesting API was accessed
  • IP from submitter
  • URL of the harvesting source
  • data format of the source (i.e. RSS/GeoRSS/geoJSON etc)
  • a harvesting frequency
  • a lifetime.

For the data format an ENUM field shall be created to normalize the format names.
This index is needed for #36
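
A minimal sketch of how such a mapping could be declared with the elasticsearch Java API; the type name import_profile and the exact field names are assumptions, not a final schema:

    import java.io.IOException;
    import org.elasticsearch.common.xcontent.XContentBuilder;
    import org.elasticsearch.common.xcontent.XContentFactory;

    public class ImportProfileMappingSketch {
        // sketch only: type and field names are placeholders for the fields listed above
        public static XContentBuilder mapping() throws IOException {
            return XContentFactory.jsonBuilder()
                .startObject()
                  .startObject("import_profile")
                    .startObject("properties")
                      .startObject("retrieval_date").field("type", "date").field("format", "dateOptionalTime").endObject() // date when the harvesting API was accessed
                      .startObject("client_host").field("type", "string").field("index", "not_analyzed").endObject()       // IP of the submitter
                      .startObject("source_url").field("type", "string").field("index", "not_analyzed").endObject()        // URL of the harvesting source
                      .startObject("source_format").field("type", "string").field("index", "not_analyzed").endObject()     // normalized ENUM name: RSS, GEORSS, GEOJSON, ...
                      .startObject("harvesting_frequency").field("type", "long").endObject()                               // seconds between harvests
                      .startObject("lifetime").field("type", "long").endObject()                                           // seconds the harvested data stays valid
                    .endObject()
                  .endObject()
                .endObject();
        }
    }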

Loklak /api/peers.json on the website says the count is 0

Running the following query http://loklak.org/api/peers.json results in

{
  "count" : "0",
  "peers" : [ {
    "host" : "80.152.219.115",
    "port.http" : -1,
    "port.https" : -1,
    "lastSeen" : 1429177934409,
    "lastPath" : "/api/search.json",
    "peername" : "anonymous"
  }, 
  .
  .
  .

Whereas the actual count of "peers", i.e. the array size, is 69 in this case.
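
The fix would presumably be to derive the count from the very list that is serialized; a minimal sketch (class and method names are illustrative, not the actual servlet code):

    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    public class PeersCountSketch {
        // derive "count" from the peers list itself so it cannot disagree with the array size
        public static Map<String, Object> buildResponse(List<Map<String, Object>> peers) {
            Map<String, Object> response = new LinkedHashMap<String, Object>();
            response.put("count", Integer.toString(peers.size()));
            response.put("peers", peers);
            return response;
        }
    }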

Frontend pushed too hard

A search frequency from the backend that is too high causes errors on the frontend peer.
DoS settings: 2500 / 100.
Recommended: 3189, according to my rough guess program.

2015-05-02 10:59:55.714:INFO::qtp1290568156-42: /api/search.json?q=google&timezoneOffset=-480&maximumRecords=100&source=twitter&minified=true
java.lang.IllegalArgumentException: Illegal group reference
at java.util.regex.Matcher.appendReplacement(Matcher.java:808)
at java.util.regex.Matcher.replaceFirst(Matcher.java:955)
at org.loklak.scraper.TwitterScraper$TwitterTweet.analyse(TwitterScraper.java:307)
at org.loklak.scraper.TwitterScraper$TwitterTweet.run(TwitterScraper.java:348)
at java.lang.Thread.run(Thread.java:745)
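
The "Illegal group reference" is thrown because Matcher.replaceFirst() treats '$' (and '\') in the replacement string as a group reference, and scraped tweet text can contain those characters. A minimal sketch of the defensive pattern (the wrapper below is illustrative, not the actual TwitterScraper code):

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class SafeReplaceSketch {
        // quoting the replacement prevents "Illegal group reference" / "No group N"
        // when the replacement text comes from a scraped tweet and contains '$' or '\'
        public static String replaceFirstLiteral(String input, Pattern pattern, String replacement) {
            return pattern.matcher(input).replaceFirst(Matcher.quoteReplacement(replacement));
        }
    }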

Data uptake slow and forward peer not responding to master peer

Don't know if it is a bug, but 2 forward peers were giving very little data.
Fix: fresh Debian and loklak install.
Also a 4.5 GB swap file now.
Not sure if a hard reset corrupts data.

2015-04-20 00:32:42.263:WARN::qtp1246952023-40: [messages][1] null
org.elasticsearch.action.NoShardAvailableActionException: [messages][1] null
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.perform(TransportShardSingleOperationAction.java:175)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.start(TransportShardSingleOperationAction.java:155)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:89)
at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:55)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:98)
at org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:193)
at org.elasticsearch.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:201)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:91)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:65)
at org.loklak.DAO.getTweetMap(DAO.java:347)
at org.loklak.DAO.record(DAO.java:402)
at org.loklak.DAO.scrapeTwitter(DAO.java:634)
at org.loklak.api.server.SearchServlet.doGet(SearchServlet.java:107)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:800)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:309)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:313)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
at java.lang.Thread.run(Thread.java:745)
2015-04-20 00:32:42.650:INFO::qtp1246952023-40: /

java.lang.IllegalArgumentException: Illegal group reference
at java.util.regex.Matcher.appendReplacement(Matcher.java:808)
at java.util.regex.Matcher.replaceFirst(Matcher.java:955)
at org.loklak.scraper.TwitterScraper$TwitterTweet.<init>(TwitterScraper.java:282)
at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:165)
at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:70)
at org.loklak.DAO.scrapeTwitter(DAO.java:625)
at org.loklak.api.server.SearchServlet.doGet(SearchServlet.java:107)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:800)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:309)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:313)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
at java.lang.Thread.run(Thread.java:745)

java.lang.IndexOutOfBoundsException: No group 8
at java.util.regex.Matcher.start(Matcher.java:374)
at java.util.regex.Matcher.appendReplacement(Matcher.java:831)
at java.util.regex.Matcher.replaceFirst(Matcher.java:955)
at org.loklak.scraper.TwitterScraper$TwitterTweet.<init>(TwitterScraper.java:282)
at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:165)
at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:70)
at org.loklak.DAO.scrapeTwitter(DAO.java:625)
at org.loklak.api.server.SearchServlet.doGet(SearchServlet.java:107)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:800)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:309)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:313)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
at java.lang.Thread.run(Thread.java:745)

aggregations : { } don't always show up on queries against localhost

For the query http://localhost:9100/api/search.json?q=spacex%20since:2015-04-01%20until:2015-04-06&source=cache&count=0&fields=mentions,hashtags&limit=6

, {
    "created_at" : "2015-05-20T02:17:10.000Z",
    "screen_name" : "pauljeenu76143",
    "text" : "Believe is win. And it is all mipaltan never lose..... Afterall my favourite MI will win the IPL 2015... #CSKvsMI #All the best",
    "link" : "https://twitter.com/pauljeenu76143/status/600847705569648640",
    "id_str" : "600847705569648640",
    "source_type" : "TWITTER",
    "provider_type" : "REMOTE",
    "provider_hash" : "7fffffff",
    "retweet_count" : 0,
    "favourites_count" : 0,
    "images" : [ ],
    "images_count" : 0,
    "hosts" : [ ],
    "hosts_count" : 0,
    "links" : [ ],
    "links_count" : 0,
    "mentions" : [ ],
    "mentions_count" : 0,
    "hashtags" : [ "CSKvsMI", "All" ],
    "hashtags_count" : 2,
    "without_l_len" : 127,
    "without_lu_len" : 127,
    "without_luh_len" : 113,
    "user" : {
      "name" : "Paulashya Trivedi",
      "screen_name" : "pauljeenu76143",
      "profile_image_url_https" : "https://pbs.twimg.com/profile_images/581453260101316608/vwW4V_18_bigger.jpg",
      "appearance_first" : "2015-05-20T02:17:51.082Z",
      "appearance_latest" : "2015-05-20T02:17:51.082Z"
    }
  } ]
}

This is the end of the response JSON for the above query on localhost, but the same query on production results in:
http://loklak.org/api/search.json?q=spacex%20since:2015-04-01%20until:2015-04-06&source=cache&count=0&fields=mentions,hashtags&limit=6

{
  "readme_0" : "THIS JSON IS THE RESULT OF YOUR SEARCH QUERY - THERE IS NO WEB PAGE WHICH SHOWS THE RESULT!",
  "readme_1" : "loklak.org is the framework for a message search system, not the portal, read: http://loklak.org/about.html#notasearchportal",
  "readme_2" : "This is supposed to be the back-end of a search portal. For the api, see http://loklak.org/api.html",
  "readme_3" : "Parameters q=(query), source=(cache|backend|twitter|all), callback=p for jsonp, maximumRecords=(message count), minified=(true|false)",
  "search_metadata" : {
    "startIndex" : "0",
    "itemsPerPage" : "0",
    "count" : "0",
    "hits" : 26,
    "period" : 9223372036854775807,
    "query" : "spacex since:2015-04-01 until:2015-04-06",
    "client" : "91.140.185.68",
    "servicereduction" : "false"
  },
  "statuses" : [ ],
  "aggregations" : {
    "hashtags" : {
      "spacex" : 4,
      "apple" : 2,
      "kca" : 2,
      "nasa" : 2,
      "space" : 2,
      "votejkt48id" : 2
    },
    "mentions" : {
      "AkulaEcho" : 1,
      "Conduru" : 1,
      "DaveDTC" : 1,
      "David_Lark" : 1,
      "JosieBaik" : 1,
      "NanotronicsImag" : 1
    }
  }
}

Is there a specific reason for this behaviour, or is it a bug? I am up to date with the latest upstream code.

NoShardAvailableException when inserting new users to elastic

I got org.elasticsearch.action.NoShardAvailableActionException: [users][7] null when trying to insert users (not just messages) into elastic.

Steps to reproduce:

  1. Uncomment this line: https://github.com/loklak/loklak_server/blob/new-push-api/src/org/loklak/api/server/GeoJsonPushServlet.java#L138
  2. Recompile & rerun loklak
  3. Execute this command (make sure the messages do not exist yet in the db):
curl -i -F callback=p \
        -F url=http://cmap-fossasia-api.herokuapp.com/ffGeoJsonp.php \
        -F map_type=shortname:screen_name,shortname:user.screen_name,name:user.name,url:link \
        http://localhost:9000/api/geojsonpush.json

What is strange is that the exception is not raised from the start, but rather after several successful insertions. Here is a snapshot of the inserted users:

Stack trace:

2015-06-27 11:04:40.380:WARN:oejs.ServletHandler:qtp1740472127-59: /api/geojsonpush.json
org.elasticsearch.action.NoShardAvailableActionException: [users][7] null
    at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.perform(TransportShardSingleOperationAction.java:175)
    at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.start(TransportShardSingleOperationAction.java:155)
    at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:89)
    at org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:55)
    at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75)
    at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:98)
    at org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:193)
    at org.elasticsearch.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:201)
    at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:91)
    at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:65)
    at org.loklak.data.AbstractIndexFactory.exists(AbstractIndexFactory.java:60)
    at org.loklak.data.DAO.writeMessage(DAO.java:501)
    at org.loklak.api.server.GeoJsonPushServlet.doPost(GeoJsonPushServlet.java:139)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:800)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
    at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
    at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:300)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:309)
    at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.Server.handle(Server.java:497)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:313)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
    at java.lang.Thread.run(Thread.java:745)

PR #56
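
A sketch of one possible mitigation, assuming the embedded node Client used by the DAO is available at this point: wait until the users index reaches at least yellow health before bulk-inserting, so writes do not hit shards that are still initializing. This is only an illustration, not necessarily what PR #56 does:

    import org.elasticsearch.action.admin.cluster.health.ClusterHealthResponse;
    import org.elasticsearch.client.Client;

    public class ShardHealthSketch {
        // block until all primary shards of the "users" index are allocated (yellow or better)
        public static boolean usersIndexReady(Client client) {
            ClusterHealthResponse health = client.admin().cluster()
                    .prepareHealth("users")
                    .setWaitForYellowStatus()
                    .execute().actionGet();
            return !health.isTimedOut();
        }
    }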

RSS Feed Generator Returns zero bytes

About 5.5 hours ago the error occurred on loklak.org
It was working for at least 1 day before with the loads I was giving it.
150 searches every 10 min from various YaCy VMs.
24 searches every 5 min from FeedDemon.
rss feed error

If loading of a user avatar through the proxy fails, try to identify a new URL through Twitter

This is a follow-up to #31 and connected to the idea in fossasia/loklak_webclient#59 (comment)

In case that the loklak_webclient retrieves an avatar picture using the proxy api and the proxy fails to load the picture, it might be possible that the user has meanwhile changed the image and a new url is valid. In that case the server shall use the user's authentication to retrieve the latest avatar url from twitter directly, update the account information (set the new url), download the image and return it.
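
A minimal sketch of that fallback flow; the abstract helpers are hypothetical stand-ins for the real proxy download, the Twitter lookup with the user's stored authentication, and the account update:

    // sketch of the proxy fallback described above; nothing here is the actual servlet code
    public abstract class AvatarFallbackSketch {
        abstract byte[] download(String url);                        // existing proxy download, null on failure
        abstract String fetchLatestAvatarUrl(String screenName);     // ask twitter with the user's stored authentication
        abstract void updateAccount(String screenName, String url);  // write the refreshed url to the account data

        public byte[] loadAvatar(String screenName, String cachedUrl) {
            byte[] image = download(cachedUrl);                      // 1. try the stored avatar url first
            if (image != null) return image;
            String freshUrl = fetchLatestAvatarUrl(screenName);      // 2. the url may have changed meanwhile
            if (freshUrl == null || freshUrl.equals(cachedUrl)) return null;
            updateAccount(screenName, freshUrl);                     // 3. remember the new url
            return download(freshUrl);                               // 4. serve the refreshed image
        }
    }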

Add a lifetime field to the message index to dynamically distinguish events from static data

We want to feed realtime data and other data into the search index. While the appearance of such data can be seen as a normal event (aka a tweet), we can also use special visualization methods (like maps) to show such events after a long time. To measure how long such data shall be displayed, we assign a lifetime value to each tweet.
I.e. if a message is submitted every hour to announce a measurement of a 'thing', the frequency of such events is one per hour. The lifetime would be one hour. The data would be displayed for one hour and then be replaced by the follow-up data of that measurement.

to-do: add a field 'lifetime' to the message index.
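
A minimal sketch of how consumers could interpret such a field; the unit (seconds) and the convention that a value of 0 means "no expiry" are assumptions:

    import java.util.Date;

    public class LifetimeSketch {
        // true while the message should still be displayed;
        // lifetimeSeconds <= 0 is treated here as "never expires" (assumption)
        public static boolean isAlive(Date createdAt, long lifetimeSeconds) {
            if (lifetimeSeconds <= 0) return true;
            return System.currentTimeMillis() < createdAt.getTime() + lifetimeSeconds * 1000L;
        }
    }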

Home page load time can be 30% faster if images are reduced in color depth

Struggling on ADSL 2+, here is a slight performance increase if you need it.
My Debian VM host loads 43% slower if everything is left alone.

Results
Before http://www.webpagetest.org/result/150427_90_7BE/
After http://www.webpagetest.org/result/150427_0X_7K4/

I converted the following files to 256-color depth.
loklak_anonymous.png
loklak_share.png
loklak_collect.png
It does add a white box that does not match the home page color, and the cow also changes a little (a rough Java sketch of the conversion follows below).
loklak_anonymous
loklak_collect
loklak_share
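
For reference, a rough Java sketch of the same kind of reduction; this is not the tool used above, flattening onto white reproduces the white-box effect, and dedicated tools (pngquant, ImageMagick) produce better palettes:

    import java.awt.Color;
    import java.awt.Graphics2D;
    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.io.IOException;
    import javax.imageio.ImageIO;

    public class PngIndexSketch {
        // re-encode a PNG with an indexed 256-color palette; the transparent
        // background is flattened onto white, which is where the white box comes from
        public static void toIndexed(File in, File out) throws IOException {
            BufferedImage src = ImageIO.read(in);
            BufferedImage dst = new BufferedImage(src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_INDEXED);
            Graphics2D g = dst.createGraphics();
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, dst.getWidth(), dst.getHeight());
            g.drawImage(src, 0, 0, null);
            g.dispose();
            ImageIO.write(dst, "png", out);
        }
    }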

Input box too small on mobile devices

screen shot 2015-04-10 at 5 29 53 pm

I think it's a good option to make the input box span the full width of the screen, and put the JSON and RSS buttons together on the next line.

In a Peer Group have the Backend respect the Frontend's speed

Under high load the backend seems to ask a slow peer for a result too often.
A way around this is to set the DoS settings of the backend to limit the request speed.
My rough formula for a front peer is: average response time + 800 ms - ping.

For the backend timing I add 500 ms to the front peer time and divide by the number of forward peers (see the sketch below).
You will sometimes see a rare 503 error in a peer when they are too close together.

The program I use may work on Linux with url2file and QB64, but works as-is on Windows with url2file installed.
Loklak tune
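
For clarity, the same timing rule written out in Java (a sketch only; variable names are illustrative):

    public class PeerTimingSketch {
        // front peer: average response time + 800 ms - ping
        public static long frontPeerIntervalMs(long avgResponseMs, long pingMs) {
            return avgResponseMs + 800 - pingMs;
        }

        // backend: (front peer interval + 500 ms) / number of forward peers
        public static long backendIntervalMs(long frontPeerIntervalMs, int forwardPeers) {
            return (frontPeerIntervalMs + 500) / forwardPeers;
        }
    }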

java.lang.IndexOutOfBoundsException: No group 7

Working fine.
Peer has been under light load for a day.

2015-04-17 14:36:42.500:INFO::qtp403370592-42: /api/search.json?q=ArianaGrande&timezoneOffset=-480&maximumRecords=100&source=twitter&minified=true
java.lang.IndexOutOfBoundsException: No group 7
at java.util.regex.Matcher.start(Matcher.java:374)
at java.util.regex.Matcher.appendReplacement(Matcher.java:831)
at java.util.regex.Matcher.replaceFirst(Matcher.java:955)
at org.loklak.scraper.TwitterScraper$TwitterTweet.<init>(TwitterScraper.java:282)
at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:165)
at org.loklak.scraper.TwitterScraper.search(TwitterScraper.java:70)
at org.loklak.DAO.scrapeTwitter(DAO.java:625)
at org.loklak.api.server.SearchServlet.doGet(SearchServlet.java:107)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:800)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:309)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:313)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
at java.lang.Thread.run(Thread.java:745)

add harvester for static data

To harvest the data submitted by #35, a process must periodically load the harvest URLs and add the harvested data to the message index. Harvested data must be transformed into the rich message format that is intended for locations, images, audio, and video.
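
A minimal sketch of such a periodic process; loadHarvestProfiles, fetchAndConvert and writeMessage are hypothetical stand-ins for the profile lookup, the format conversion and the DAO write:

    import java.util.List;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // sketch only; the abstract helpers are hypothetical
    public abstract class HarvesterSketch {
        abstract List<String> loadHarvestProfiles();        // source URLs + formats stored by the push API (#35)
        abstract List<Object> fetchAndConvert(String url);  // download and map to the rich message format
        abstract void writeMessage(Object message);         // store into the message index

        public void start() {
            ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(new Runnable() {
                public void run() {
                    for (String url : loadHarvestProfiles()) {
                        for (Object message : fetchAndConvert(url)) {
                            writeMessage(message);
                        }
                    }
                }
            }, 0, 10, TimeUnit.MINUTES); // the period would come from the stored harvesting frequency
        }
    }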

It is sort of an overloaded server

I have Loklak and Yacy running on the same virtual machine with about 50 RSS calls every 10 min.
I am in the process of creating a second server.

2015-04-01 13:19:36.835:WARN::qtp7189425-40:
java.util.ConcurrentModificationException
at java.util.TreeMap$DescendingMapIterator.makePrev(TreeMap.java:385)
at java.util.TreeMap$UnboundedDescendingEntryIterator.next(TreeMap.java:452)
at java.util.TreeMap$UnboundedDescendingEntryIterator.next(TreeMap.java:437)
at java.util.AbstractMap$2$1.next(AbstractMap.java:385)
at org.loklak.api.server.SearchServlet.doGet(SearchServlet.java:163)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:800)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:309)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:313)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
at java.lang.Thread.run(Thread.java:745)
java.util.ConcurrentModificationException
at java.util.TreeMap$DescendingMapIterator.makePrev(TreeMap.java:385)
at java.util.TreeMap$UnboundedDescendingEntryIterator.next(TreeMap.java:452)
at java.util.TreeMap$UnboundedDescendingEntryIterator.next(TreeMap.java:437)
at java.util.AbstractMap$2$1.next(AbstractMap.java:385)
at org.loklak.api.server.SearchServlet.doGet(SearchServlet.java:163)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:800)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:309)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:313)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)
at java.lang.Thread.run(Thread.java:745)
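
The trace points at SearchServlet iterating a TreeMap while another thread (presumably the scraper) is still inserting into it. A minimal sketch of the usual workaround, assuming the shared structure is a TreeMap: copy it under a lock before iterating (writers would have to synchronize on the same lock; a ConcurrentSkipListMap would be an alternative):

    import java.util.Map;
    import java.util.TreeMap;

    public class SnapshotIterationSketch {
        // iterate over a private copy so that concurrent inserts
        // cannot invalidate the iterator of the shared map
        public static <K, V> void renderNewestFirst(TreeMap<K, V> shared) {
            TreeMap<K, V> snapshot;
            synchronized (shared) {
                snapshot = new TreeMap<K, V>(shared);
            }
            for (Map.Entry<K, V> entry : snapshot.descendingMap().entrySet()) {
                // ... write entry.getValue() into the search result here
            }
        }
    }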
