yatish27 / linkedin-scraper

Scrapes the public profile of the linkedin page

License: MIT License

Ruby 100.00%

linkedin-scraper's Issues

languages && certifications empty response

Hi!

I just tested the library and it is working great, with a few minor glitches: the JSON response returns empty values for languages and certifications (I tested on my own account, where both are filled in). In your code I saw this:

def languages
  @languages ||= @page.search(".background-languages #languages ol li").map do |item|
    language    = item.at("h4").text rescue nil
    proficiency = item.at("div.languages-proficiency").text.gsub(/\s+|\n/, " ").strip rescue nil
    { :language => language, :proficiency => proficiency }
  end
end

def certifications
  @certifications ||= @page.search("background-certifications").map do |item|
    name       = item.at("h4").text.gsub(/\s+|\n/, " ").strip rescue nil
    authority  = item.at("h5").text.gsub(/\s+|\n/, " ").strip rescue nil
    license    = item.at(".specifics/.licence-number").text.gsub(/\s+|\n/, " ").strip rescue nil
    start_date = item.at(".certification-date").text.gsub(/\s+|\n/, " ").strip rescue nil

    { :name => name, :authority => authority, :license => license, :start_date => start_date }
  end
end

On the public profile there are no .background-languages or background-certifications classes. I use the following PHP code with the simpledom library, and it works:

$education = $html->find('#education > li.school');

foreach ($education as $school) {
    $school_name = $school->find('.item-title', 0)->innertext;
    $title = $school->find('.item-subtitle', 0)->innertext;
    $start_date = !empty($school->find('.date-range > time', 0)) ? date('Y', strtotime($school->find('.date-range > time', 0)->innertext)) : '';
    $end_date = !empty($school->find('.date-range > time', 1)) ? date('Y', strtotime($school->find('.date-range > time', 1)->innertext)) : '';

    if (!empty($school_name) && !empty($title)) {
        $candidate_education[] = $start_date . ' - ' . $end_date . ' ' . $school_name . ' - ' . $title . ' <br />';
    }
}

$certifications = $html->find('#certifications > li.certification');

foreach ($certifications as $certification) {
    $name = $certification->find('h4.item-title > a', 0);

    if (!empty($name)) {
        $candidate_certifications[] = [
            'name' => $name->innertext,
            'url' => $name->href
        ];
    }
}

Maybe this helps you.

Lib paths?

Hello there,
let me first thank you very much for a great piece of software!

I had a little trouble installing this on both Mac OS X and Ubuntu; the error is pretty much the same on each.

kaisar@kaisar-MS-7641:~$ linkedin-scraper http://www.linkedin.com/in/jeffweiner08

/home/kaisar/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/site_ruby/2.0.0/rubygems/core_ext/kernel_require.rb:55:in `require': cannot load such file -- ./lib/linkedin-scraper (LoadError)
from /home/kaisar/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/site_ruby/2.0.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /home/kaisar/.rvm/gems/ruby-2.0.0-p247/gems/linkedin-scraper-0.0.11/bin/linkedin-scraper:3:in `<top (required)>'
from /home/kaisar/.rvm/gems/ruby-2.0.0-p247/bin/linkedin-scraper:23:in `load'
from /home/kaisar/.rvm/gems/ruby-2.0.0-p247/bin/linkedin-scraper:23:in `<main>'
from /home/kaisar/.rvm/gems/ruby-2.0.0-p247/bin/ruby_executable_hooks:15:in `eval'
from /home/kaisar/.rvm/gems/ruby-2.0.0-p247/bin/ruby_executable_hooks:15:in `

Could you give me a pointer as to how I may solve this?

Thank you very much

Usage example does not appear to be working.

When walking through the usage instructions contained within the README, nearly all of the suggested methods were returning nil or an empty array.

I was having issues scraping LinkedIn using Nokogiri and Ruby's OpenURI module before trying out this gem. Perhaps LinkedIn is doing something to interfere with scraping attempts?

Tested using the following:

linkedin-scraper (0.1.2)
ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-darwin12.0]

Uninitialized Constant LinkedIn

When running the following in irb:

require 'linkedin-scraper'
profile = LinkedIn::Profile.get_profile("http://www.linkedin.com/in/robertwdempsey")

I get the following error:

NameError: uninitialized constant LinkedIn
from (irb):2
from /Users/roger/.rvm/rubies/ruby-1.9.3-p327/bin/irb:18:in `

I'm running ruby version 1.9.3p327 and have all the other required gems installed.

Any ideas?

Thanks!

Robert

NameError: uninitialized constant Linkedin in Rails

Hi!

Today I tried to use this gem with Rails 4.2.3 and got NameError: uninitialized constant Linkedin.
The same error was described previously in issue #63.
Then I changed linkedin_scraper back to linkedin-scraper everywhere in the project, and it started working. I'm not entirely sure, but it seems that having a gem name (from the gemspec) that differs from the main file name in the lib directory breaks loading under Rails.

You can try my PR #84

Unable to scrape locally hosted profileExample.html file

Hello All,

I had a project working a couple of months ago, returned to it this weekend, and ran into an issue. Hopefully someone can point me in the right direction; I'm at a loss. I did a fresh install of the latest version of linkedin-scraper.

In the past, I was able to save the source code from a profile, host it locally, then run "linkedin-scraper http://localhost:9999/jeffweiner08_local.html". This worked perfectly.

Now when I do this, it comes up with empty arrays (see below). When I point it back to the actual public profile (http://www.linkedin.com/in/jeffweiner08), everything works as expected.

Any ideas what I'm doing wrong? I'm currently on Mac OS X; in the past I was running RHEL 7.

Example Result when using a local file:

########:~ user$ linkedin-scraper http://localhost:9999/jeffweiner08_local.html
{
  "name": "Jeff Weiner",
  "first_name": "Jeff",
  "last_name": "Weiner",
  "title": "CEO at LinkedIn",
  "location": "San Francisco Bay Area",
  "number_of_connections": "3",
  "country": "San Francisco Bay Area",
  "industry": null,
  "summary": null,
  "picture": "https://media.licdn.com/mpr/mpr/shrinknp_400_400/p/6/005/07c/31e/153cdd3.jpg",
  "projects": [

  ],
  "linkedin_url": "http://localhost:9999/Jeff_Weiner_local.html",
  "education": [

  ],
  "groups": [

  ],
  "websites": [

  ],
  "languages": [

  ],
  "skills": [

  ],
  "certifications": [

  ],
  "organizations": [

  ],
  "past_companies": [

  ],
  "current_companies": [

  ],
  "recommended_visitors": [

  ]
}

Legal

How do you cope with linkedin user agreement part 8.2?

8.2. Don'ts. You agree that you will not:
Scrape or copy profiles and information of others through any means (including crawlers, browser plugins and add-ons, and any other technology or manual work);

Having trouble calling skills

Great gem. Thank you.

I'm having difficulty calling :skills and I was wondering if I was doing something wrong. From the console, the skills are returned as Mechanize page links, but when I return a JSON object the skills are nowhere to be found.

Here's the rabl view:

object @Profile => :profile
attributes :first_name, :last_name, :title, :location, :country, :industry, :current_companies, :past_companies, :websites, :groups, :skills, :education

And here's the app that returns JSON:

require 'sinatra'
require 'rabl'
require 'linkedin-scraper'
require 'active_support/core_ext'
require 'active_support/inflector'
require 'builder'

Rabl.register!

get '/' do
  "Hello Index"
end

get '/profile' do
  @Profile = Linkedin::Profile.get_profile(params[:url])
  render :rabl, :profile, format: 'json'
end

Still unable to use proxy IP

Is there any way we can get a proxy variable added to this script, please? I know how to do it in PHP, where it is easy using cURL, so surely it must be simple in Ruby as well!

picture.nil? NoMethodError: undefined method `value' for nil:NilClass

Some profiles have a problem with the picture:
puts profile.picture
NoMethodError: undefined method `value' for nil:NilClass
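
The message suggests that `value` is being called on a node or attribute that was not found. A minimal sketch of a nil-safe lookup follows. It assumes the page object responds to `at(selector)` and returns nodes with an `attributes` hash, as Nokogiri does; `FakePage`, `FakeNode`, `FakeAttr`, `safe_picture` and the `.profile-picture img` selector are hypothetical stand-ins, not the gem's code.

```ruby
# Stand-ins for a parsed page; a real page would come from Mechanize/Nokogiri.
FakeAttr = Struct.new(:value)
FakeNode = Struct.new(:attributes)

class FakePage
  def initialize(nodes)
    @nodes = nodes
  end

  # Mimics Nokogiri's `at`: returns the first match or nil.
  def at(selector)
    @nodes[selector]
  end
end

# Returns the image URL, or nil when the node or its src attribute is missing.
# The default selector is an assumption made for this sketch.
def safe_picture(page, selector = '.profile-picture img')
  node = page.at(selector)
  return nil if node.nil?

  src = node.attributes['src']
  src.nil? ? nil : src.value.strip
end

with_picture = FakePage.new(
  { '.profile-picture img' => FakeNode.new({ 'src' => FakeAttr.new(' http://example.com/p.jpg ') }) }
)
without_picture = FakePage.new({})

puts safe_picture(with_picture).inspect
puts safe_picture(without_picture).inspect
```

With a guard like this, a profile without a picture would yield nil instead of raising.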

need to type in login info somewhere

The error message I got is here:

C:/Ruby193/lib/ruby/gems/1.9.1/gems/net-http-persistent-2.9.4/lib/net/http/persistent/ssl_reuse.rb:70:in `connect': SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed (OpenSSL::SSL::SSLError)
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/net-http-persistent-2.9.4/lib/net/http/persistent/ssl_reuse.rb:70:in `block in connect'
from C:/Ruby193/lib/ruby/1.9.1/timeout.rb:55:in `timeout'
from C:/Ruby193/lib/ruby/1.9.1/timeout.rb:100:in `timeout'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/net-http-persistent-2.9.4/lib/net/http/persistent/ssl_reuse.rb:70:in `connect'
from C:/Ruby193/lib/ruby/1.9.1/net/http.rb:756:in `do_start'
from C:/Ruby193/lib/ruby/1.9.1/net/http.rb:751:in `start'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:700:in `start'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:631:in `connection_for'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:994:in `request'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/mechanize-2.7.3/lib/mechanize/http/agent.rb:259:in `fetch'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/mechanize-2.7.3/lib/mechanize/http/agent.rb:976:in `response_redirect'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/mechanize-2.7.3/lib/mechanize/http/agent.rb:300:in `fetch'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/mechanize-2.7.3/lib/mechanize.rb:440:in `get'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/linkedin-scraper-0.1.3/lib/linkedin-scraper/profile.rb:20:in `initialize'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/linkedin-scraper-0.1.3/bin/linkedin-scraper:4:in `new'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/linkedin-scraper-0.1.3/bin/linkedin-scraper:4:in `<top (required)>'
from C:/Ruby193/bin/linkedin-scraper:23:in `load'
from C:/Ruby193/bin/linkedin-scraper:23:in `

Are there any configurations I have not done?

Return Status code from page

As scraping linkedin may fail for various reasons, you should probably return the page status. Something like this, but you may have a better idea.

@status = http_client.head( url ).code.to_i
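
If a status code were exposed this way, a caller could branch on it. The sketch below is only an illustration of that idea: `scrape_outcome` and its category names are hypothetical, and treating 999 as a block follows what users report elsewhere in these issues rather than any documented behaviour.

```ruby
# Maps a numeric HTTP status to a coarse outcome a caller can branch on.
# The categories are assumptions made for this sketch.
def scrape_outcome(status)
  case status
  when 200..299 then :ok
  when 999      then :blocked # non-standard code reported when requests are refused
  when 400..499 then :client_error
  when 500..599 then :server_error
  else :unknown
  end
end

[200, 404, 999].each do |code|
  puts "#{code} => #{scrape_outcome(code)}"
end
```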

Facebook Scraping issue

999 => for -- https://www.linkedin.com/in/some-profile

Hey, this issue is still not resolved: impossible to make even the first request, I fall directly on the 999 error whereas the

curl -A "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3" -I --url

works perfectly well.

profile.rb:258:in `block in get_company_details': undefined method `[]' for nil:NilClass (NoMethodError)

I get this error when trying to get linkedin data

/usr/local/var/rbenv/versions/2.2.0/lib/ruby/gems/2.2.0/gems/linkedin-scraper-2.1.0/lib/linkedin-scraper/profile.rb:258:in `block in get_company_details': undefined method `[]' for nil:NilClass (NoMethodError)

LoadError

I installed linkedin-scraper and tried to include it in my script. I get the following load error:

LoadError: cannot load such file -- linkedin-scraper
from /home/abhi/.rbenv/versions/2.1.5/lib/ruby/2.1.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /home/abhi/.rbenv/versions/2.1.5/lib/ruby/2.1.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from (irb):3
from /home/abhi/.rbenv/versions/2.1.5/bin/irb:11:in `
'

Groups gives an empty set back

Hi,

It seems like the groups section doesn't work anymore; it always returns an empty set.
I tried correcting the code to:

  @groups ||= @page.search('#groups .groups-name').map do |item|
    name = item.text.gsub(/\s+|\n/, ' ').strip if item.at('strong')

    { :name => name }
  end

Not working so far ..
thanks

Static tests

Need to test using fixtures for profile pages as well as company pages.

Company names are empty

The past_companies collection is populated but the company names are empty for all but the first position.

Profile used: https://www.linkedin.com/in/darrenbounds

Bad lib path

I installed linkedin-scraper on Lubuntu 13.10 by "gem install linkedin-scraper".
Running "linkedin-scraper http://www.linkedin.com/in/jeffweiner08" crashes with this error:

$ linkedin-scraper http://www.linkedin.com/in/jeffweiner08
/usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require': cannot load such file -- ./lib/linkedin-scraper (LoadError)
    from /usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require'
    from /var/lib/gems/1.9.1/gems/linkedin-scraper-0.0.11/bin/linkedin-scraper:3:in `<top (required)>'
    from /usr/local/bin/linkedin-scraper:23:in `load'
    from /usr/local/bin/linkedin-scraper:23:in `<main>'

Error is in file "/var/lib/gems/1.9.1/gems/linkedin-scraper-0.0.11/bin/linkedin-scraper". Diff (one line fix):

--- /var/lib/gems/1.9.1/gems/linkedin-scraper-0.0.11/bin/linkedin-scraper_OLD   2013-10-20 19:31:04.675030346 +0200
+++ /var/lib/gems/1.9.1/gems/linkedin-scraper-0.0.11/bin/linkedin-scraper   2013-10-20 19:31:25.787013101 +0200
@@ -1,5 +1,5 @@
 #!/usr/bin/env ruby

-require './lib/linkedin-scraper'
+require '/var/lib/gems/1.9.1/gems/linkedin-scraper-0.0.11/lib/linkedin-scraper'
 profile = Linkedin::Profile.new(ARGV[0])
 puts profile.to_json

after that output is ok:

pokus@pokus:$ linkedin-scraper http://www.linkedin.com/in/jeffweiner08
{"name":"Jeff Weiner","first_name":"Jeff","last_name":"Weiner","title":"CEO at LinkedIn","location":"Mounta:
...
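
The one-line fix above hard-codes an absolute path, so it only works for that particular install location. A more portable approach is to resolve the library relative to the executable's own file. The following is a minimal sketch of that idea; `lib_path_for` is a hypothetical helper, not something the gem provides.

```ruby
# Resolves the library path from the location of the executable itself,
# so the result does not depend on the current working directory.
def lib_path_for(bin_file)
  File.expand_path('../lib/linkedin-scraper', File.dirname(bin_file))
end

# Inside the real executable this could be used as: require lib_path_for(__FILE__)
puts lib_path_for('/var/lib/gems/1.9.1/gems/linkedin-scraper-0.0.11/bin/linkedin-scraper')
```

On Ruby 1.9.2 and later, `require_relative '../lib/linkedin-scraper'` inside the executable expresses the same idea, since it resolves against the file that contains the call.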

able to extract number of connections from the profile?

Maybe add a small feature of extracting number of connections from the profile.

For example, on "https://www.linkedin.com/in/jeffweiner100" the top right corner of the name card shows 500+ connections.

Get number of endorsements of each skill?

First of all: Great gem!

Second, I would love to get the number of endorsements for each skill. Would that be possible to implement?

Scraper cannot scrape names sometimes

Sometimes when I scrape my own profile with url="https://www.linkedin.com/in/johnwu93", I am able to get my name by doing Linkedin::Profile.get_profile(url).name

However, sometimes this does not work and Linkedin::Profile.get_profile(url) actually returns nil. I looked at the other issues, and it seems that the console should output a connection error before the function returns nil. In my case, however, the console does not output anything. Is there a way to fix this?

How to get the scraped data in a file (instead of the prompt)?

Hello,

I need to use the scraped data to enrich some RDF files. How can I get the scraped data into a file (instead of the prompt)?
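
Since the command line tool prints JSON to standard output, shell redirection (`linkedin-scraper URL > profile.json`) is probably the simplest route. From Ruby, the same JSON string can be written with `File.write`. The sketch below assumes you already hold a JSON string such as the result of `profile.to_json`; `save_json`, the sample data and the output file name are hypothetical.

```ruby
require 'json'
require 'tmpdir'

# Writes any JSON string (for example the result of profile.to_json) to a file
# and returns the number of bytes written.
def save_json(json_string, path)
  File.write(path, json_string)
end

# Hypothetical sample data standing in for real scraper output.
sample = JSON.generate({ 'name' => 'Jeff Weiner', 'skills' => [] })
path = File.join(Dir.tmpdir, 'profile_sketch.json')

save_json(sample, path)
puts JSON.parse(File.read(path))['name']
```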

A newbie has an Issue - Error: connection refused: www.linkedin.com:443

Hello everyone,
thanks for sharing this great gem!

I'm in the following situation:
Soon I'll start writing my master's thesis. To build the necessary database, I want to use the information from a few thousand public LinkedIn profiles.
I have never programmed at all and am currently trying to find a way to create the database.
In this context I found this gem. While using it, I ran into the issue described in the title.
As soon as I use the following, I get the error below:
profile = Linkedin::Profile.get_profile("https://www.linkedin.com/in/xxx", {:proxy_ip=>'127.0.0.1',:proxy_port=>'3128', :username=>"xxx", :password=>'xxx'})

output: connection refused: www.linkedin.com:443
=> nil

Can you help me on that? What does it mean? How can I solve the issue?

As I wrote before, I am a total newbie and glad of any help I can get.
Do you have any general recommendations for my situation? Is this gem suitable for my purpose?

Thanks a lot

Returns nil in production but fine in development

Using Ruby 2.1.1 and Rails 4.0.3

Development, works great. Production, I get this

999 => for -- http://www.linkedin.com/in/mmahalwy

some public profiles have different format and do not parse

Some profiles are not able to be parsed, as they have slightly different CSS. For example, see http://www.linkedin.com/pub/carl-reichenbach/0/717/271 (random person). ".full-name" has no nested given-name or family-name elements, just a single text value:

<span class="n fn"><span class="full-name">Carl Reichenbach</span><span></span></span>

The profile image does not have id "#profile-picture", but rather, class ".profile-picture".

<div class="profile-picture"><a href="https://www.linkedin.com/reg/join-pprofile?_ed=0_01QnG4t0pWUEtxKrTXmsjRWJlD5Jt4ZMf87SacJnJbVIQMerUlnPlhlEJudlwN_Ln2SYzqPPSVnMDbi0r_G5A1&amp;trk=pprof-0-ts-view_full-0"> <img src="http://m.c.lnkd.licdn.com/media/p/4/000/16f/3c0/08a6c0e.jpg" alt="Carl Reichenbach"></a><span></span></div>

I don't know if this is a new format they're transitioning to or from, but if you could please update the code to look for these patterns in addition to the existing ones (and any other differences found on this page, which I trust is representative), then we could successfully scrape them.

NameError: uninitialized constant Linkedin

Hello Yatish,

Thanks a lot for creating this great gem. And even more for taking the time to maintain it...

I used it in a previous app, and it worked fine. In this new one (rails), i'm facing a strange issue that I don't understand. When trying to get a profile's data (rails c / s), I get the following error:

2.2.0 :001 > u = "https://www.linkedin.com/in/nicolassarkozy"
 => "https://www.linkedin.com/in/nicolassarkozy" 
2.2.0 :002 > profile = Linkedin::Profile.get_profile u

NameError: uninitialized constant Linkedin
    from (irb):2
    from /Users/me/.rvm/gems/ruby-2.2.0/gems/railties-4.2.5/lib/rails/commands/console.rb:110:in `start'
    from /Users/me/.rvm/gems/ruby-2.2.0/gems/railties-4.2.5/lib/rails/commands/console.rb:9:in `start'
    from /Users/me/.rvm/gems/ruby-2.2.0/gems/railties-4.2.5/lib/rails/commands/commands_tasks.rb:68:in `console'
    from /Users/me/.rvm/gems/ruby-2.2.0/gems/railties-4.2.5/lib/rails/commands/commands_tasks.rb:39:in `run_command!'
    from /Users/me/.rvm/gems/ruby-2.2.0/gems/railties-4.2.5/lib/rails/commands.rb:17:in `<top (required)>'
    from /Users/me/code/lnkdn-xtrct/bin/rails:8:in `<top (required)>'
    from /Users/me/.rvm/rubies/ruby-2.2.0/lib/ruby/site_ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
    from /Users/me/.rvm/rubies/ruby-2.2.0/lib/ruby/site_ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
    from -e:1:in `<main>'

Would you have any idea how to solve this ?

Thanks a lot,
Brice

Proxy on command line tool

Some comments in other issues indicate that a second command line arg can be used to allow usage of a proxy with the command line tool. That doesn't seem to be the case - looking at linkedin-scraper's source, it doesn't seem to call the gem with {:proxy_ip=>'127.0.0.1',:proxy_port=>'3128'} as the second arg.

Is this a feature that is planned? I'm not a ruby dev but this seems like it's easy enough to add, any interest in a PR?
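
As a rough idea of what such a change could look like, the sketch below builds the options hash from extra command line arguments. The argument order (URL, proxy IP, proxy port) and the `proxy_options` helper are assumptions for illustration; only the `:proxy_ip` and `:proxy_port` keys come from the issues above.

```ruby
# Builds the options hash from the arguments that follow the profile URL.
# Assumed call shape (hypothetical): linkedin-scraper URL PROXY_IP PROXY_PORT
def proxy_options(argv)
  _url, proxy_ip, proxy_port = argv
  return {} if proxy_ip.nil? || proxy_port.nil?

  { :proxy_ip => proxy_ip, :proxy_port => proxy_port }
end

p proxy_options(['http://www.linkedin.com/in/jeffweiner08'])
p proxy_options(['http://www.linkedin.com/in/jeffweiner08', '127.0.0.1', '3128'])

# In the executable, the result could then be handed to the gem, e.g.:
#   Linkedin::Profile.get_profile(ARGV[0], proxy_options(ARGV))
```

Returning an empty hash when no proxy is given keeps the existing single-argument usage working unchanged.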

Install hangs at Building Native Extensions in OSX Mavericks

Hey guys:

Forgive me, I'm somewhat new to Ruby and creating gems. When I attempt to install scraper on OSX Mavericks, it seems to hang indefinitely at this stage. I've tried Verbose mode, but that only adds the following information:

Installing gem unf_ext-0.0.6
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/.document
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/.gitignore
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/Gemfile
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/LICENSE.txt
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/README.md
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/Rakefile
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/ext/unf_ext/extconf.rb
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/ext/unf_ext/unf.cc
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/ext/unf_ext/unf/normalizer.hh
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/ext/unf_ext/unf/table.hh
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/ext/unf_ext/unf/trie/char_stream.hh
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/ext/unf_ext/unf/trie/node.hh
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/ext/unf_ext/unf/trie/searcher.hh
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/ext/unf_ext/unf/util.hh
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/lib/unf_ext.rb
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/lib/unf_ext/version.rb
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/test/helper.rb
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/test/normalization-test.txt
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/test/test_unf_ext.rb
/Library/Ruby/Gems/2.0.0/gems/unf_ext-0.0.6/unf_ext.gemspec
Building native extensions. This could take a while...
/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/bin/ruby extconf.rb
checking for main() in -lstdc++...

I've updated the command line tools (based on this recommendation) http://stackoverflow.com/questions/23429145/error-failed-to-build-gem-native-extension-ruby-extconf-rb-mac-osx.

That doesn't seem to help. I can't seem to find the exact reason for the hang in Stack Overflow (or anywhere else). Any suggestions on next steps or how to fix this monster would be greatly appreciated.

SocketError: getaddrinfo: Name or service not known

Hi, when I use the functions current_positions and past_positions I always get this error: SocketError: getaddrinfo: Name or service not known
All other functions on that profile work fine.
I realize that it is making another call to get the page, and that call fails.
How do I fix that?

This is the trace for the error:

from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/socksify-1.5.0/lib/socksify.rb:172:in `initialize'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/socksify-1.5.0/lib/socksify.rb:172:in `initialize'
from /home/chinmay/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/net/http.rb:879:in `open'
from /home/chinmay/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/net/http.rb:879:in `block in connect'
from /home/chinmay/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/timeout.rb:76:in `timeout'
from /home/chinmay/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/net/http.rb:878:in `connect'
from /home/chinmay/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/net/http.rb:863:in `do_start'
from /home/chinmay/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/net/http.rb:858:in `start'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:700:in `start'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:965:in `reset'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:628:in `connection_for'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:994:in `request'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/mechanize-2.7.2/lib/mechanize/http/agent.rb:257:in `fetch'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/mechanize-2.7.2/lib/mechanize.rb:432:in `get'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/linkedin-scraper-0.1.1/lib/linkedin-scraper/profile.rb:180:in `get_company_details'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/linkedin-scraper-0.1.1/lib/linkedin-scraper/profile.rb:166:in `block in get_companies'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/nokogiri-1.6.1/lib/nokogiri/xml/node_set.rb:237:in `block in each'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/nokogiri-1.6.1/lib/nokogiri/xml/node_set.rb:236:in `upto'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/nokogiri-1.6.1/lib/nokogiri/xml/node_set.rb:236:in `each'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/linkedin-scraper-0.1.1/lib/linkedin-scraper/profile.rb:151:in `get_companies'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/linkedin-scraper-0.1.1/lib/linkedin-scraper/profile.rb:69:in `current_companies'
from (irb):7
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/railties-4.0.4/lib/rails/commands/console.rb:90:in `start'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/railties-4.0.4/lib/rails/commands/console.rb:9:in `start'
from /home/chinmay/.rvm/gems/ruby-2.1.1/gems/railties-4.0.4/lib/rails/commands.rb:62:in `<top (required)>'
from bin/rails:4:in `require'

Thanks

Problem when linkedIn URL has accents

Is there another way I can write the URL?
I tried URLEncoding, etc.

linkedin-scraper http://cl.linkedin.com/pub/alejandra-peña-lazcano/61/9ab/68/en

/usr/local/bin/linkedin-scraper:13:in `

': invalid byte sequence in UTF-8 (ArgumentError)
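
One thing that may be worth trying is percent-encoding only the non-ASCII characters before passing the URL on. The sketch below shows that transformation; `escape_non_ascii` is a hypothetical helper, and whether it resolves this particular error is untested, since the failure may also come from how the shell passes the argument.

```ruby
# Percent-encodes every non-ASCII character as its UTF-8 bytes and leaves
# the rest of the URL (scheme, slashes, ASCII text) untouched.
def escape_non_ascii(url)
  utf8 = url.dup.force_encoding(Encoding::UTF_8)
  utf8.gsub(/[^[:ascii:]]/) do |char|
    char.bytes.map { |byte| format('%%%02X', byte) }.join
  end
end

puts escape_non_ascii("http://cl.linkedin.com/pub/alejandra-pe\u00F1a-lazcano/61/9ab/68/en")
```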

v1 current_companies sometimes returns nil? false with nothing.

education description not working

This always appears as empty string.

Add version flags to the command line executable

Attempting to check the version number of the gem through the command line executable throws an error instead, because the argument is treated as a profile URL and a query is made.

Adding both -v and --version flags would be helpful! Ruby's OptionParser class is a fairly common method for building these flags into a gem's executable (see Nokogiri's binary file for an example).
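
A minimal sketch of how such flags could be wired up with OptionParser follows. `SKETCH_VERSION` and `parse_cli` are hypothetical names; the real executable and the place where the gem stores its version are not shown in this issue.

```ruby
require 'optparse'

# Hypothetical constant; the real gem keeps its version elsewhere.
SKETCH_VERSION = '0.0.0'

# Parses flags and returns [action, remaining_args] without touching the network.
def parse_cli(argv)
  action = :scrape
  parser = OptionParser.new do |opts|
    opts.banner = 'Usage: linkedin-scraper [options] PROFILE_URL'
    opts.on('-v', '--version', 'Print the version and exit') { action = :version }
  end
  rest = parser.parse(argv)
  [action, rest]
end

action, _rest = parse_cli(['--version'])
puts SKETCH_VERSION if action == :version
```

Deciding the action before any request is made means `-v` never reaches the scraping code path.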

Does linked scraper follow links?

I'm using linkedin-scraper in a web configuration. After 4 scrapes I ran into a 999 (blocked by LinkedIn). I made a workaround: fetch the profile with wget first, store it, and serve it from my own server. After this I scrape http://localhost/profile.html with linkedin-scraper. This works like a charm... however, after a number of scrapes I run into a 999 block again. I'm not a Ruby programmer and have a hard time following the code.
My question is simple: is linkedin-scraper following links in the profile, and therefore still accessing linkedin.com? And if so, does it really need to? I noticed that opening the stored profile on my server as a web page loads the LinkedIn page again (there is a redirect in the file).

Hope you can shed some light on this.

"See less" included in skills

The link to "See less" is not being filtered out of the skills.
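
One possible fix is to filter the toggle label out after the skills are collected. This is a minimal sketch operating on plain strings; `clean_skills` and `TOGGLE_LABELS` are hypothetical, and the label list contains only the "See less" text reported here.

```ruby
# Drops UI toggle labels that were scraped along with real skill names.
# The label list is an assumption; only "See less" is reported in this issue.
TOGGLE_LABELS = ['see less'].freeze

def clean_skills(raw_skills)
  raw_skills
    .map { |skill| skill.to_s.gsub(/\s+/, ' ').strip }
    .reject { |skill| skill.empty? || TOGGLE_LABELS.include?(skill.downcase) }
end

p clean_skills(['Ruby', ' See less ', 'Web Scraping', ''])
```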

Wrong country assigned for US profile

For US profiles, the country element in the DOM contains the state or area instead of the country. This needs to be fixed.

Getting it to Work for a Company Page

This has been working really great for me so far for individual public profiles, thank you. I'm really interested, though, in scraping a company page. Here's an example of one i'm interested in:

https://www.linkedin.com/company/nestle-s-a-?trk=affco

These pages have a lot of unique data and we have a lot of potential accounts that we'd like to know the industry for, and/or number of employees. I understand that searching by individual yields company data. But we have a lot of company names for which we don't yet have contacts.

I'm interested in creating a version that could scrape company pages. I would be happy to create a pull and build it myself, and I will attempt to do so. Unfortunately I'm fairly new to programming and I'm very new to Ruby. So if you would be willing to give me a head start by telling me what files I should be modifying, or just help me build it, or (if you really feel like it) build it yourself, I would really appreciate it. Thanks so much!

Scraper not working

The scraper fails on profile picture:

/usr/local/share/gems/gems/linkedin-scraper-0.1.7/lib/linkedin_scraper/profile.rb:75:in `picture': undefined method `value' for nil:NilClass (NoMethodError)
from /usr/local/share/gems/gems/linkedin-scraper-0.1.7/lib/linkedin_scraper/profile.rb:177:in `block in to_json'
from /usr/local/share/gems/gems/linkedin-scraper-0.1.7/lib/linkedin_scraper/profile.rb:177:in `each'
from /usr/local/share/gems/gems/linkedin-scraper-0.1.7/lib/linkedin_scraper/profile.rb:177:in `reduce'
from /usr/local/share/gems/gems/linkedin-scraper-0.1.7/lib/linkedin_scraper/profile.rb:177:in `to_json'
from /usr/local/share/gems/gems/linkedin-scraper-0.1.7/bin/linkedin-scraper:5:in `<top (required)>'
from /usr/local/bin/linkedin-scraper:23:in `load'
from /usr/local/bin/linkedin-scraper:23:in `

Maybe it is better to catch the exception and return a partial result
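
A sketch of what "return a partial result" could mean: call each attribute separately and record nil for the ones that raise. `BrokenPictureProfile` and `partial_json` are hypothetical stand-ins written for this illustration, not the gem's own `to_json`.

```ruby
require 'json'

# Stand-in for a profile whose picture lookup raises, as in the trace above.
class BrokenPictureProfile
  def name
    'Jeff Weiner'
  end

  def picture
    nil.value # raises NoMethodError, like the reported failure
  end
end

# Calls each attribute separately so one failure does not abort the rest.
def partial_json(profile, attributes)
  data = attributes.each_with_object({}) do |attribute, result|
    result[attribute] = begin
      profile.public_send(attribute)
    rescue StandardError
      nil
    end
  end
  JSON.generate(data)
end

puts partial_json(BrokenPictureProfile.new, [:name, :picture])
```

Swallowing every StandardError hides real bugs, so a production version would probably log the failure as well.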

999 response error while running from Rackspace

Hi, when I use the scraper from my Rackspace machine I receive a 999 response.
Details of the error below:
linkedin-scraper http://www.linkedin.com/in/jeffweiner08
/usr/local/rvm/gems/ruby-2.0.0-p247/gems/mechanize-2.7.3/lib/mechanize/http/agent.rb:933:in `response_read': 999 => for -- http://www.linkedin.com/in/jeffweiner08 (Mechanize::ResponseCodeError)
from /usr/local/rvm/gems/ruby-2.0.0-p247/gems/mechanize-2.7.3/lib/mechanize/http/agent.rb:262:in `block in fetch'
from /usr/local/rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/net/http.rb:1413:in `block (2 levels) in transport_request'
from /usr/local/rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/net/http/response.rb:162:in `reading_body'
from /usr/local/rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/net/http.rb:1412:in `block in transport_request'
from /usr/local/rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/net/http.rb:1403:in `catch'
from /usr/local/rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/net/http.rb:1403:in `transport_request'
from /usr/local/rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/net/http.rb:1376:in `request'
from /usr/local/rvm/gems/ruby-2.0.0-p247/gems/net-http-persistent-2.9/lib/net/http/persistent.rb:986:in `request'
from /usr/local/rvm/gems/ruby-2.0.0-p247/gems/mechanize-2.7.3/lib/mechanize/http/agent.rb:259:in `fetch'
from /usr/local/rvm/gems/ruby-2.0.0-p247/gems/mechanize-2.7.3/lib/mechanize.rb:440:in `get'
from /usr/local/rvm/gems/ruby-2.0.0-p247/gems/linkedin-scraper-0.1.3/lib/linkedin-scraper/profile.rb:20:in `initialize'
from /usr/local/rvm/gems/ruby-2.0.0-p247/gems/linkedin-scraper-0.1.3/bin/linkedin-scraper:4:in `new'
from /usr/local/rvm/gems/ruby-2.0.0-p247/gems/linkedin-scraper-0.1.3/bin/linkedin-scraper:4:in `<top (required)>'
from /usr/local/rvm/gems/ruby-2.0.0-p247/bin/linkedin-scraper:23:in `load'
from /usr/local/rvm/gems/ruby-2.0.0-p247/bin/linkedin-scraper:23:in

Should not request to company detail if there is no link

It should not send a request to the company detail link if there is no link; see https://github.com/yatish27/linkedin-scraper/blob/master/lib/linkedin-scraper/profile.rb#L199. Some companies don't have a link, for example on https://www.linkedin.com/in/antoniofelaco.

if company_link
  result = get_company_details(company_link)
else
  result = {}
end

I don't have enough time to send PR. :)

Gem::Ext::BuildError: ERROR: Failed to build gem native extension.

Great gem, exactly what I have been looking for.
I get the following error when trying to install the gem through the command
gem install linkedin-scraper -v 1.0.4, gem install linkedin-scraper or in my gemfile with gem 'linkedin-scraper'
Seems like it has a problem on the dependency of unf_ext

Installing unf_ext 0.0.7.2 with native extensions

Gem::Ext::BuildError: ERROR: Failed to build gem native extension.

/home/jan/.rbenv/versions/2.2.3/bin/ruby -r ./siteconf20160324-3908-iq7116.rb extconf.rb
checking for main() in -lstdc++... no
creating Makefile

make "DESTDIR=" clean

make "DESTDIR="
compiling unf.cc
make: g++: Command not found
make: *** [unf.o] Error 127

make failed, exit code 2

Gem files will remain installed in /home/jan/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/unf_ext-0.0.7.2 for inspection.
Results logged to /home/jan/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/extensions/x86-linux/2.2.0-static/unf_ext-0.0.7.2/gem_make.out
An error occurred while installing unf_ext (0.0.7.2), and Bundler cannot continue.
Make sure that gem install unf_ext -v '0.0.7.2' succeeds before bundling.

Any idea what the problem could be?
Thank you

[RFC] Need for connections information

It would be great if we can use this scraper in a way that you can also get the information about the connections that a person has.... This would help build a better profile about a "scraped profile". This would require logging into the Linkedin directly from the code or based upon a cookie that is provided to the scraper.

Mechanize::ResponseCodeError: 999 => for -- https://www.linkedin.com/in/jeffweiner08

linkedin-scraper (2.1.1)

irb(main):001:0> require 'linkedin-scraper'
=> true
irb(main):002:0> profile = Linkedin::Profile.new("http://www.linkedin.com/in/jeffweiner08")
Mechanize::ResponseCodeError: 999 => for -- https://www.linkedin.com/in/jeffweiner08

Class name clash with 'browser' gem

The new dependency on random_user_agent gem has added a nasty classname clash with widely used browser gem. Both gems define a Browser class without any namespacing.

I am afraid I am not the only one who was hit by this - it seems to me the browser gem is widely used.
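
The usual remedy for this kind of clash is for each library to define its classes inside its own module. The sketch below only illustrates the principle; `UserAgentSketch` and `DetectionSketch` are hypothetical modules and do not reflect how either gem is actually organised.

```ruby
# Two libraries that each want a class called Browser can coexist when
# every definition lives in its own module. Both modules are hypothetical.
module UserAgentSketch
  class Browser
    def describe
      'picks a random user agent string'
    end
  end
end

module DetectionSketch
  class Browser
    def describe
      'detects the visiting browser'
    end
  end
end

puts UserAgentSketch::Browser.new.describe
puts DetectionSketch::Browser.new.describe
```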

Scraper is conflating start and end date at companies

Hi, I love the gem, but I am having some issues parsing the dates in Linkedin::Profile#past_companies

It looks like the hashes returned in #current_companies give start_date and end_date keys, while #past_companies only have start_date. Example output:

Actual LinkedIn:

Thanks

[RFC] direct database store

It would be great if the information could be sent directly to a table in a database. I think this would help adoption of the code. People would be able to use the scraped data directly from the database as soon as it is committed.

API not working any more

hi,

I had the code installed on my own laptop a month ago, and everything worked fine.
Now I have just created a new instance on EC2 and installed/pulled the code, but I am getting:

ubuntu@ip-172-31-42-56:~$ linkedin-scraper http://www.linkedin.com/in/jeffweiner08
/home/ubuntu/.rvm/gems/ruby-2.2.1/gems/mechanize-2.7.3/lib/mechanize/http/agent.rb:933:in `response_read': 999 => for -- http://www.linkedin.com/in/jeffweiner08 (Mechanize::ResponseCodeError)
from /home/ubuntu/.rvm/gems/ruby-2.2.1/gems/mechanize-2.7.3/lib/mechanize/http/agent.rb:262:in `block in fetch'

This is the first call I have ever made from this machine, so it cannot already be blocked by LinkedIn! Do you know what could be happening?
Thanks!
Matt

doesn't scrape company details anymore

The company details are not returned anymore for me. Same for you ?

profile = Linkedin::Profile.get_profile("http://www.linkedin.com/in/jeffweiner08")
website = profile.current_companies.first[:website]
# nil