Comments (6)
I was able to compile it after changing
import fs2.Task to import fs2._
But run fails!
object WikiGo {
  def main(args: Array[String]): Unit = {
    val crawler = new myCrawler
    // crawl wikipedia sequentially and take 10 elements (titles of visited websites)
    val titles: Vector[String] = crawler.sequentialCrawl("https://wikipedia.org").take(10).runLog.unsafeRun
    println(titles)
  }
}
[IJ]> compile
[success] Total time: 0 s, completed Sep 8, 2017 11:12:51 PM
[IJ]> run
[info] Running WikiGo
[error] (run-main-5) java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V
java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V
at com.marekkadek.scrawler.crawlers.Visit.<init>(crawlers.scala:12)
at com.marekkadek.scrawler.crawlers.Crawler.sequentialCrawl(crawlers.scala:35)
at WikiGo$.main(WikiGo.scala:5)
at WikiGo.main(WikiGo.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
[trace] Stack trace suppressed: run last compile:run for the full output.
java.lang.RuntimeException: Nonzero exit code: 1
at scala.sys.package$.error(package.scala:27)
[trace] Stack trace suppressed: run last compile:run for the full output.
[error] (compile:run) Nonzero exit code: 1
[error] Total time: 0 s, completed Sep 8, 2017 11:12:57 PM
[IJ]>
from scrawler.
Hi,
you probably have a dependency problem (the code is fine).
Can you share your build.sbt with us?
Also, did you create a jar file for this project? Why not just add it as a dependency in build.sbt?
As to your first comment: this project/crawler basically emits an fs2 Stream, and for that you typically need to import fs2.{Strategy, Stream, Task}
from scrawler.
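A NoSuchMethodError on scala.Product.$init$ is typically a sign of a Scala binary version mismatch: classes compiled for Scala 2.12 running against a 2.11 scala-library. One way to check the dependency problem suggested above is to print the scala-library version that is actually loaded at runtime and compare it with the scalaVersion in build.sbt. The sketch below is only an illustration; the object name ScalaVersionCheck and the helper binaryVersion are hypothetical and not part of scrawler.

```scala
// Hypothetical diagnostic helper, not part of scrawler.
object ScalaVersionCheck {
  // Reduce a full version such as "2.11.8" to its binary version "2.11".
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  def main(args: Array[String]): Unit = {
    // Version of the scala-library jar actually loaded at runtime.
    val runtime = scala.util.Properties.versionNumberString
    println(s"scala-library at runtime: $runtime (binary ${binaryVersion(runtime)})")
  }
}
```

If the binary version printed here differs from the one a jar on the classpath was compiled for, errors like the one in the stack trace above are a likely result.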
Cloned the source:
git clone git@github.com:KadekM/scrawler.git
** I used sbt package to create jar files; this was the wrong move **
Created a project:
.
├── build.sbt (see below)
├── lib (*the compiled jar files go here ** not necessary & cause of the error **)
├── project
│ └── build.properties (specify the sbt version 0.13.15 in my case)
└── src
└── main
└── scala (*myCrawler.scala goes here)
Imported the project into IntelliJ via sbt,
created myCrawler.scala,
copied the class,
and added:
import com.marekkadek.scraper.Document
import com.marekkadek.scraper.jsoup.JsoupBrowser
import com.marekkadek.scrawler.crawlers.{Crawler, Visit, Yield, YieldData}
import fs2.{Strategy, Stream, Task} // ** this solves the two errors below **
However, I am stuck on 2 errors:
- override protected def onDocument(document: Document): Stream[Task, Yield[String]] = {
  Error: Task takes type parameters
- Stream.emit(title) ++ Stream.emits(followableLinks)
  Error: Cannot resolve ++, emit, emits
I'm a complete newbie myself, so I am stuck here.
from scrawler.
Hi:
Here's my build.sbt
name := "ScraperProject"
version := "1.1"
scalaVersion := "2.11.8"
libraryDependencies += "com.marekkadek" %% "scrawler" % "0.0.3"
Ah, the jar files were somehow causing the problem!
I removed the lib directory (with the jars) from the root dir & it ran fine.
I guess one cannot use libraryDependencies & jar files at the same time. My 1st Scala project with an external lib dependency compiled & ran!
Thank you very much,
from scrawler.
Hi, no problem, happy crawling
from scrawler.
@yelled1 yes, just use sbt for dependency management :)
Feel free to open an issue if you encounter any problems.
from scrawler.
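To summarize the fix from this thread, here is a minimal build.sbt sketch based on the one posted above, with comments added. The comments about the likely cause are an interpretation of the stack trace, not something the maintainer confirmed.

```scala
// build.sbt: a minimal sketch based on the build file posted in this thread
name := "ScraperProject"

version := "1.1"

// Every jar on the classpath should be built for this Scala binary version (2.11).
scalaVersion := "2.11.8"

// %% appends the Scala binary suffix (_2.11 here) to the artifact name,
// so sbt resolves a scrawler build that matches scalaVersion.
libraryDependencies += "com.marekkadek" %% "scrawler" % "0.0.3"

// Note: sbt also puts any jars found in the project's lib/ directory on the
// classpath (unmanaged dependencies). Assumption: the hand-built jars in lib/
// were compiled for a different Scala version, which would explain the
// NoSuchMethodError. Removing lib/ leaves only the managed dependency.
```

With this setup there is no need to clone scrawler or run sbt package; sbt downloads the published artifact on the first compile.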