Readme basic usage about scrawler HOT 6 OPEN

KadekM commented on July 28, 2024
Readme basic usage

from scrawler.

Comments (6)

yelled1 commented on July 28, 2024

I was able to compile it after changing the import from
import fs2.Task to import fs2._
But the run fails!

object WikiGo {
  def main(args: Array[String]): Unit = {
    val crawler = new myCrawler
    // crawl wikipedia sequentially and take 10 elements (titles of visited websites)
    val titles: Vector[String] = crawler.sequentialCrawl("https://wikipedia.org").take(10).runLog.unsafeRun
    println(titles)
  }
}

[IJ]> compile
[success] Total time: 0 s, completed Sep 8, 2017 11:12:51 PM
[IJ]> run
[info] Running WikiGo
[error] (run-main-5) java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V
java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V
at com.marekkadek.scrawler.crawlers.Visit.<init>(crawlers.scala:12)
at com.marekkadek.scrawler.crawlers.Crawler.sequentialCrawl(crawlers.scala:35)
at WikiGo$.main(WikiGo.scala:5)
at WikiGo.main(WikiGo.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
[trace] Stack trace suppressed: run last compile:run for the full output.
java.lang.RuntimeException: Nonzero exit code: 1
at scala.sys.package$.error(package.scala:27)
[trace] Stack trace suppressed: run last compile:run for the full output.
[error] (compile:run) Nonzero exit code: 1
[error] Total time: 0 s, completed Sep 8, 2017 11:12:57 PM
[IJ]>

visox commented on July 28, 2024

Hi,

you probably have a dependency problem (the code is fine).

Can you share your build.sbt with us?

Also, did you create a jar file for this project? Why not just add it as a dependency in build.sbt?

And regarding your first comment: this project/crawler basically emits an fs2 Stream, and for that you typically need to import fs2.{Strategy, Stream, Task}
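
A NoSuchMethodError on scala.Product.$init$ typically means that some class on the classpath was compiled for Scala 2.12 or later but is running against an older scala-library (2.11 or earlier), since that method only exists from 2.12 on. One quick way to see which scala-library the program actually runs with is a small sketch like this one. ScalaVersionCheck and binaryVersion are names made up for this example; only scala.util.Properties comes from the standard library.

```scala
object ScalaVersionCheck {
  // Reduce a full version such as "2.11.8" to its binary version "2.11".
  // (Hypothetical helper, written for this example only.)
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  def main(args: Array[String]): Unit = {
    // Version of the scala-library jar that is loaded at runtime.
    val runtime = scala.util.Properties.versionNumberString
    println(s"scala-library at runtime: $runtime (binary version ${binaryVersion(runtime)})")
  }
}
```

If this reports a 2.11.x library while one of the jars on the classpath was built with 2.12, that mismatch would be a likely explanation for the error.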

yelled1 commented on July 28, 2024

Cloned the source:
git clone git@github.com:KadekM/scrawler.git
Then I used sbt package to create jar files (this was the wrong move).
Created a project:
.
├── build.sbt (see below)
├── lib (the compiled jar files go here; not necessary, and the cause of the error)
├── project
│   └── build.properties (specifies the sbt version, 0.13.15 in my case)
└── src
    └── main
        └── scala (myCrawler.scala goes here)

Imported the project into IntelliJ via sbt,
created myCrawler.scala,
copied the class into it,
and added:
import com.marekkadek.scraper.Document
import com.marekkadek.scraper.jsoup.JsoupBrowser
import com.marekkadek.scrawler.crawlers.{Crawler, Visit, Yield, YieldData}
import fs2.{Strategy, Stream, Task} // this solves the two errors below

However, I am stuck on 2 errors:

  1. override protected def onDocument(document: Document): Stream[Task, Yield[String]] = {
     Error: Task takes type parameters
  2. Stream.emit(title) ++ Stream.emits(followableLinks)
     Error: Cannot resolve ++, emit, emits

I'm a complete newbie myself, so I am stuck here.

yelled1 commented on July 28, 2024

Hi:

Here's my build.sbt

name := "ScraperProject"

version := "1.1"

scalaVersion := "2.11.8"

libraryDependencies += "com.marekkadek" %% "scrawler" % "0.0.3"

Ah, the jar files were somehow causing the problem!
I removed the lib directory (with the jars) from the root dir and it ran fine.
I guess one cannot use libraryDependencies and jar files for the same library at the same time. My first Scala project with an external library dependency compiled and ran!
Thank you very much,

visox commented on July 28, 2024

Hi, no problem, happy crawling

KadekM commented on July 28, 2024

@yelled1 yes, just use sbt for dependency management :)
Feel free to open an issue if you encounter any problems.
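
For anyone landing here with the same error, here is a minimal build.sbt sketch of the sbt-only setup described in this thread. The coordinates and versions are the ones yelled1 posted above. The comments about the lib directory are a reading of the thread (by default sbt puts every jar under lib/ on the classpath as an unmanaged dependency), not something verified against this exact project.

```scala
// build.sbt: let sbt resolve scrawler instead of copying jars by hand
name := "ScraperProject"

version := "1.1"

scalaVersion := "2.11.8"

// %% appends the Scala binary version (here _2.11) to the artifact name,
// so sbt fetches a build of scrawler that matches scalaVersion.
libraryDependencies += "com.marekkadek" %% "scrawler" % "0.0.3"

// Assumption: no hand-built scrawler jars are left in ./lib. sbt adds every jar
// found there to the classpath as an unmanaged dependency, and a jar built for
// a different Scala version can then conflict with the managed one.
```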
