Comments (29)

Alankar0416 commented on June 11, 2024

@dvijparekh1995 We can, however, read each span's class name and map it to a digit. But this will break when they change the class names again.

from justdial-scrapper.

vishnu1991 commented on June 11, 2024

JD uses a fixed series of class names to display the phone number, one span per digit.
If we can extract each span's class name, then we can get mobile numbers easily.

The series is as follows:
Number - span class="icon-XX"
1 - icon-yz
2 - icon-wx
3 - icon-vu
4 - icon-ts
5 - icon-rq
6 - icon-po
7 - icon-nm
8 - icon-lk
9 - icon-ji
0 - icon-acb
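
The series above can be expressed as a lookup table. This is a minimal sketch in Python, assuming the class names listed in this comment are still the ones JD serves (they can change at any time) and that the class names for one phone number have already been extracted in order. `decode_phone` is a hypothetical helper name.

```python
# Class-name -> digit table copied from the series in this comment.
# Assumption: these names are still current; JD can change them at any time.
ICON_TO_DIGIT = {
    "icon-yz": "1",
    "icon-wx": "2",
    "icon-vu": "3",
    "icon-ts": "4",
    "icon-rq": "5",
    "icon-po": "6",
    "icon-nm": "7",
    "icon-lk": "8",
    "icon-ji": "9",
    "icon-acb": "0",
}


def decode_phone(icon_classes):
    """Turn an ordered list of icon class names into a digit string.

    An unknown class name raises KeyError, so a changed mapping is noticed
    instead of silently producing a wrong number.
    """
    return "".join(ICON_TO_DIGIT[name] for name in icon_classes)


print(decode_phone(["icon-ji", "icon-lk", "icon-nm", "icon-acb"]))  # prints 9870
```

Raising on an unknown class is a deliberate choice here: a silent fallback would hide the moment the site changes its class names.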

ketanshah79 commented on June 11, 2024

Thanks @Alankar0416 for sharing the code.

Here is the array mapping I used as a second pass over the CSV file. I used .find_all to get the phone number.

  • '<bound method Tag.find_all of ' => '',
  • '>' => '',
  • '<span class=""mobilesv icon-dc"">' => '',
  • '<span class=""mobilesv icon-fe"">' => '',
  • '<span class=""mobilesv icon-hg"">' => '',
  • '<span class=""mobilesv icon-ba"">' => '-',
  • '<span class=""mobilesv icon-acb"">' => '0',
  • '<span class=""mobilesv icon-yz"">' => '1',
  • '<span class=""mobilesv icon-wx"">' => '2',
  • '<span class=""mobilesv icon-vu"">' => '3',
  • '<span class=""mobilesv icon-ts"">' => '4',
  • '<span class=""mobilesv icon-rq"">' => '5',
  • '<span class=""mobilesv icon-po"">' => '6',
  • '<span class=""mobilesv icon-nm"">' => '7',
  • '<span class=""mobilesv icon-lk"">' => '8',
  • '<span class=""mobilesv icon-ji"">' => '9',
  • '<bound method Tag.find_all of ' => '',
  • '>' => '',

Attached is my PHP code.
clean_csv.php.txt
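
The same second pass could be sketched in Python, the language of the original scraper. This is not the attached PHP file. It swaps the literal string replacements for a regular expression that picks out each `mobilesv icon-xx` class, and it assumes the phone column holds the raw `find_all` dump shown in the mapping above. `clean_phone_cell` and `clean_csv` are hypothetical names.

```python
import csv
import io
import re

# Replacement table taken from the mapping above
# (icon-dc, icon-fe and icon-hg map to nothing, icon-ba to a dash).
ICON_TO_CHAR = {
    "icon-dc": "", "icon-fe": "", "icon-hg": "", "icon-ba": "-",
    "icon-acb": "0", "icon-yz": "1", "icon-wx": "2", "icon-vu": "3",
    "icon-ts": "4", "icon-rq": "5", "icon-po": "6", "icon-nm": "7",
    "icon-lk": "8", "icon-ji": "9",
}

ICON_RE = re.compile(r"mobilesv (icon-[a-z]+)")


def clean_phone_cell(cell):
    """Replace a raw find_all dump with the characters its spans encode."""
    names = ICON_RE.findall(cell)
    if not names:
        return cell  # header row or already-clean value: leave untouched
    # Unknown icons become "?" so a changed mapping is visible in the output.
    return "".join(ICON_TO_CHAR.get(name, "?") for name in names)


def clean_csv(text, phone_column):
    """Return CSV text with the given column index decoded in every row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        if len(row) > phone_column:
            row[phone_column] = clean_phone_cell(row[phone_column])
        writer.writerow(row)
    return out.getvalue()
```

The `csv` module un-doubles the `""` quotes while reading, so the pattern only needs to match the class attribute as it appears in the page source.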

krishnamalireddy commented on June 11, 2024

I am getting a urllib open timeout error. Is this code still working for anyone?

Alankar0416 commented on June 11, 2024

I was able to earlier, but it seems they have started sending SVG images instead of numbers.

Alankar0416 commented on June 11, 2024

Yes, I had that in mind. But the issue is that they can change the class names whenever they want, and this will break then. It is better to think of something more robust. The most foolproof solution is to use digit recognition on the image.

vishnu1991 commented on June 11, 2024

Yes, I think the same, as they will surely change it.

krishnamalireddy commented on June 11, 2024

I'm not getting the phone numbers. Can you tell me how to get them?

Alankar0416 commented on June 11, 2024

@krishnamalireddy JD is now using SVGs in place of actual numbers, which is why parsing fails. There are a couple of ways to get around this:

  • Each SVG has a unique code that can be mapped to a digit. This will fail if they change the mapping again.
  • Use digit recognition on the SVG.

Unfortunately, I don't have the time to develop this right now. I will pick it up whenever I have some bandwidth.

hrwndr commented on June 11, 2024

@Alankar0416 Could you please demonstrate how we can extract the numbers from the SVGs in code?

AdityaMalireddy commented on June 11, 2024

> @Alankar0416 Could you please demonstrate how we can extract the numbers from the SVGs in code?

A simple solution is to use .find_all instead of .string for the phone number.

You will get the seemingly random SVG codes, which you can then convert to digits.
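
To illustrate what collecting the spans gives you: in BeautifulSoup, `.string` is `None` for a tag with several children, while `.find_all('span')` returns each child span. The sketch below shows the same idea using only the standard library's `html.parser`, so it is a stand-in for the BeautifulSoup call and not the scraper's actual code. The sample markup is made up, the `mobilesv` class comes from the mapping earlier in this thread, and `SpanCodeCollector` and `span_codes` are hypothetical names.

```python
from html.parser import HTMLParser


class SpanCodeCollector(HTMLParser):
    """Collect the icon-* class of every <span class="mobilesv icon-xx">."""

    def __init__(self):
        super().__init__()
        self.codes = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        # attrs is a list of (name, value) pairs; value can be None.
        classes = (dict(attrs).get("class") or "").split()
        if "mobilesv" in classes:
            self.codes.extend(c for c in classes if c.startswith("icon-"))


def span_codes(html):
    """Return the icon codes of all phone-digit spans, in document order."""
    parser = SpanCodeCollector()
    parser.feed(html)
    return parser.codes


# Made-up sample markup in the shape described in this thread.
sample = (
    '<p class="contact-info">'
    '<span class="mobilesv icon-ji"></span>'
    '<span class="mobilesv icon-lk"></span>'
    "</p>"
)
print(span_codes(sample))  # ['icon-ji', 'icon-lk']
```

The resulting codes still have to be converted with whatever code-to-digit table is current at the time of scraping.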

Alankar0416 commented on June 11, 2024

The issue is that we can keep a map from SVG code to number, but JD can change it at any time.

AdityaMalireddy commented on June 11, 2024

Yes, they can change it at any time. If they change it, we have to decode it again. By the way, they haven't changed it for a long time.

Alankar0416 commented on June 11, 2024

Great work @ketanshah79
I haven't tried this code. Are you able to successfully map phone numbers with this additional script? If so, I can add it to the original script to make things easier for everyone.

ketanshah79 commented on June 11, 2024

Dhiren-Biren commented on June 11, 2024

It is only retrieving 10 records.

mps1305 commented on June 11, 2024

@Alankar0416 could you please post the code along with @ketanshah79's changes?
I need to get Justdial data for a college project.
Please, if either of you could do it, it would be really helpful.

Thanks!

dvijparekh commented on June 11, 2024

> @Alankar0416 could you please post the code along with @ketanshah79's changes?
> I need to get Justdial data for a college project.
> Please, if either of you could do it, it would be really helpful.
>
> Thanks!

@mps1305 Check my forked repo. I have made the changes accordingly and it is working; just change the URL to whichever one you want.

mps1305 commented on June 11, 2024

Hey @dvijparekh, it was working until some time ago, then I started getting this error. Any help in this regard would be highly appreciated!
"[WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond"

dvijparekh commented on June 11, 2024

It seems like Justdial is blocking the scraper. I am working on it.
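
If the timeouts come from the site rejecting or stalling default urllib requests, a common first step is to send a browser-like User-Agent, set an explicit timeout, and retry with a pause. This is a general sketch and not a confirmed fix for JD's blocking. `fetch`, the header value, and the retry numbers are assumptions, and the `opener` and `sleep` parameters exist only so the function can be exercised without network access.

```python
import time
import urllib.error
import urllib.request

# Assumption: a generic browser-like User-Agent; any realistic value could be used.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def fetch(url, retries=3, timeout=10, backoff=2.0,
          opener=urllib.request.urlopen, sleep=time.sleep):
    """Return the response body as bytes, retrying on network errors."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    request = urllib.request.Request(url, headers=HEADERS)
    last_error = None
    for attempt in range(retries):
        try:
            with opener(request, timeout=timeout) as response:
                return response.read()
        except OSError as exc:  # covers URLError, timeouts and WinError 10060
            last_error = exc
            if attempt < retries - 1:
                sleep(backoff * (attempt + 1))  # wait longer after each failure
    raise last_error
```

If every attempt fails, the last error is re-raised so the caller still sees the original cause.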

SuhailSaify commented on June 11, 2024

Hey, I have written a script that will scrape phone numbers from any Justdial business page.
It uses the information in the CSS stylesheet to build a mapping from the strings assigned to each digit.
The mapping is rebuilt every time a page is loaded, so it works for every business.

Please try this:
https://github.com/SuhailSaify/Justdial-Scrapper

PS: It also scrapes other info along with phone numbers.
(Working as of July 2019)
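
The idea of rebuilding the mapping from the stylesheet on every page load might look roughly like this. It is not the code from the linked repository. It assumes the page's CSS contains rules of the form `.icon-xx:before{content:"\9d0NN"}` and that you know which glyph code stands for which character. The rule format, the code points, and the `GLYPH_TO_CHAR` table are all illustrative assumptions, and `build_class_map` is a hypothetical name.

```python
import re

# Matches rules such as  .icon-yz:before{content:"\9d002"}
# Assumption: the stylesheet uses this rule shape.
RULE_RE = re.compile(
    r'\.(icon-[a-z]+):before\s*\{\s*content:\s*"\\([0-9a-fA-F]+)"'
)

# Hypothetical glyph-code -> character table; real values would have to be
# read off the icon font the page uses.
GLYPH_TO_CHAR = {0x9D001: "0", 0x9D002: "1", 0x9D003: "2"}


def build_class_map(css_text, glyph_to_char=GLYPH_TO_CHAR):
    """Map each icon class found in the stylesheet to the character it shows."""
    mapping = {}
    for class_name, hex_code in RULE_RE.findall(css_text):
        char = glyph_to_char.get(int(hex_code, 16))
        if char is not None:
            mapping[class_name] = char
    return mapping


css = '.icon-acb:before{content:"\\9d001"}.icon-yz:before{content:"\\9d002"}'
print(build_class_map(css))  # {'icon-acb': '0', 'icon-yz': '1'}
```

Because the class names are read from the page itself, a renamed class does not break the lookup as long as the glyph codes stay the same.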

abhi-ux commented on June 11, 2024

Can anyone post the latest code here?

abhi-ux commented on June 11, 2024

builditpossible-gs commented on June 11, 2024

I am close to solving this issue. Can anyone help me with this error? https://stackoverflow.com/questions/60875316/typeerror-string-indices-must-be-integers-when-getting-class-fro-span-tag-using

dvijparekh commented on June 11, 2024

> I am close to solving this issue. Can anyone help me with this error? https://stackoverflow.com/questions/60875316/typeerror-string-indices-must-be-integers-when-getting-class-fro-span-tag-using

Please share the Justdial URL you are trying to scrape.

builditpossible-gs commented on June 11, 2024

> > I am close to solving this issue. Can anyone help me with this error? https://stackoverflow.com/questions/60875316/typeerror-string-indices-must-be-integers-when-getting-class-fro-span-tag-using
>
> Please share the Justdial URL you are trying to scrape.

Solved it, brother. Thank you.

builditpossible-gs commented on June 11, 2024

There is another error, though:
AttributeError: 'NoneType' object has no attribute 'text'
on the line return body.find('span', {'class':'mrehover'}).text.strip() in get_address

dvijparekh commented on June 11, 2024

> There is another error, though:
> AttributeError: 'NoneType' object has no attribute 'text'
> on the line return body.find('span', {'class':'mrehover'}).text.strip() in get_address

This means it cannot find a span tag with the class mrehover, so body.find returns None, which has no text attribute.
Try the code below and let me know what you get from it:

tesVar = body.find('span', {'class':'mrehover'})
print(tesVar)
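
Once the missing span is confirmed, `get_address` can guard against it instead of raising. This is a minimal sketch, assuming `body` is the BeautifulSoup tag for one listing as in the traceback above. Returning `None` for a missing address is a design choice made here, not something the original script does. The two small stub classes only stand in for BeautifulSoup objects so the example runs on its own.

```python
def get_address(body):
    """Return the stripped address text, or None when the span is absent."""
    span = body.find('span', {'class': 'mrehover'})
    if span is None:
        return None
    return span.text.strip()


# Stand-ins for BeautifulSoup objects, used only to demonstrate the guard.
class FakeSpan:
    text = "  12 Example Road, Mumbai  "


class FakeBody:
    def __init__(self, span):
        self._span = span

    def find(self, name, attrs):
        return self._span


print(get_address(FakeBody(FakeSpan())))  # 12 Example Road, Mumbai
print(get_address(FakeBody(None)))        # None
```

A caller can then write an empty cell for listings without an address instead of aborting the whole scrape.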

alokm014 commented on June 11, 2024

Hey, use this method: https://youtu.be/EkbF5JwuHqU
