gocolly / colly
Elegant Scraper and Crawler Framework for Golang
Home Page: https://go-colly.org/
License: Apache License 2.0
ID is part of the list of common initialisms used by the golint tool, and thus we get warnings like these:
struct field Id should be ID
func parameter requestId should be requestID
func parameter collectorId should be collectorID
Unfortunately, fixing this problem would change the exported types Collector and Request. Would a change like this even be considered?
Example code:
package main

import (
	"fmt"

	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector()

	// Find and visit all links
	c.OnHTML("a", func(e *colly.HTMLElement) {
		link := e.Attr("href")
		fmt.Println(link)
		c.Visit(e.Request.AbsoluteURL(link))
	})

	c.Visit("https://en.wikipedia.org/")
}
Execution log:
==================
WARNING: DATA RACE
Write at 0x00c420083950 by main goroutine:
runtime.mapassign_faststr()
/usr/local/go/src/runtime/hashmap_fast.go:598 +0x0
net/textproto.MIMEHeader.Set()
/usr/local/go/src/net/textproto/header.go:22 +0x60
net/http.Header.Set()
/usr/local/go/src/net/http/header.go:31 +0x60
net/http.(*Request).AddCookie()
/usr/local/go/src/net/http/request.go:385 +0x37c
net/http.(*Client).send()
/usr/local/go/src/net/http/client.go:170 +0x115
net/http.(*Client).Do()
/usr/local/go/src/net/http/client.go:602 +0x513
crawler/vendor/github.com/asciimoo/colly.(*httpBackend).Do()
/home/skruglov/Projects/go/src/crawler/vendor/github.com/asciimoo/colly/http_backend.go:154 +0x105
crawler/vendor/github.com/asciimoo/colly.(*httpBackend).Cache()
/home/skruglov/Projects/go/src/crawler/vendor/github.com/asciimoo/colly/http_backend.go:110 +0x9e
crawler/vendor/github.com/asciimoo/colly.(*Collector).scrape()
/home/skruglov/Projects/go/src/crawler/vendor/github.com/asciimoo/colly/colly.go:226 +0x461
crawler/vendor/github.com/asciimoo/colly.(*Collector).Visit()
/home/skruglov/Projects/go/src/crawler/vendor/github.com/asciimoo/colly/colly.go:157 +0x9b
main.main()
/home/skruglov/Projects/go/src/crawler/main.go:19 +0xe4
Previous read at 0x00c420083950 by goroutine 24:
runtime.mapaccess1_faststr()
/usr/local/go/src/runtime/hashmap_fast.go:208 +0x0
net/http.http2isConnectionCloseRequest()
/usr/local/go/src/net/http/h2_bundle.go:8652 +0xae
net/http.(*http2clientConnReadLoop).endStreamError()
/usr/local/go/src/net/http/h2_bundle.go:8288 +0xe2
net/http.(*http2clientConnReadLoop).endStream()
/usr/local/go/src/net/http/h2_bundle.go:8277 +0x54
net/http.(*http2clientConnReadLoop).processData()
/usr/local/go/src/net/http/h2_bundle.go:8267 +0x1ce
net/http.(*http2clientConnReadLoop).run()
/usr/local/go/src/net/http/h2_bundle.go:7896 +0x737
net/http.(*http2ClientConn).readLoop()
/usr/local/go/src/net/http/h2_bundle.go:7788 +0x11c
Goroutine 24 (running) created at:
net/http.(*http2Transport).newClientConn()
/usr/local/go/src/net/http/h2_bundle.go:7053 +0xe1a
net/http.(*http2Transport).NewClientConn()
/usr/local/go/src/net/http/h2_bundle.go:6991 +0x55
net/http.(*http2addConnCall).run()
/usr/local/go/src/net/http/h2_bundle.go:835 +0x55
==================
#mw-head
#p-search
/wiki/Wikipedia
...
I just tested colly, and loved how fast it performed. I have a question about encodings. The docs say it automatically decodes non-Unicode responses. Can this be customized? I tried to grab content from https://www.nsd.ru/ru/db/news/ndcpress/ , which is windows-1251 encoded, and the content came back unreadable, so I was wondering how I can set up colly to grab content with a specific encoding.
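One workaround that seems to work: re-decode the raw body by hand in an OnResponse callback with golang.org/x/text. A minimal sketch, assuming the page really is windows-1251 (colly's automatic detection is simply bypassed here):

package main

import (
	"fmt"
	"log"

	"github.com/gocolly/colly"
	"golang.org/x/text/encoding/charmap"
)

func main() {
	c := colly.NewCollector()
	c.OnResponse(func(r *colly.Response) {
		// Re-decode the windows-1251 body into UTF-8 by hand.
		utf8Body, err := charmap.Windows1251.NewDecoder().Bytes(r.Body)
		if err != nil {
			log.Println("decode failed:", err)
			return
		}
		r.Body = utf8Body
		fmt.Println(len(r.Body), "bytes decoded")
	})
	c.Visit("https://www.nsd.ru/ru/db/news/ndcpress/")
}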
Hi,
I don't get the jobs response; it seems the login was not successful. Not sure what I missed. Please show me the right way to do it. Thanks.
package main

import (
	"fmt"
	"log"
	"strings"

	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector()

	err := c.Post("https://www.linkedin.com/uas/login-submit", map[string]string{
		"session_key":      "EMAIL",
		"session_password": "PASSWORD",
	})
	if err != nil {
		log.Fatal(err)
	}

	c.AllowedDomains = []string{"www.linkedin.com"}

	// attach callbacks after login
	c.OnResponse(func(r *colly.Response) {
		log.Println("response received", r.StatusCode)
	})
	c.OnError(func(_ *colly.Response, err error) {
		log.Println("Something went wrong:", err)
	})
	c.OnHTML("a[href]", func(e *colly.HTMLElement) {
		fmt.Println("element:", e)
		if strings.Contains(e.Attr("href"), "/jobs/view") {
			fmt.Println("replaced:", strings.Replace(e.Attr("href"), "https://www.linkedin.com/", "", -1))
			e.Request.Visit(e.Attr("href"))
		}
	})

	// start scraping
	c.Visit("https://www.linkedin.com/jobs/")
}
Do you have any plans to support proxies and encoding/decoding?
We should probably break out OnRequestError/OnResponseError eventually, but adding a basic OnError that receives the request, response, and error seems to make sense as a first pass.
Pull request on its way shortly.
Does it handle SPA sites, where JavaScript generates the site client-side?
When there is a page redirect, colly automatically follows it. In that case, I get a Request object in the OnHTML callback, but it seems that colly provides the original Request and not the one after the redirect. Since I want to follow all the links on the HTML page, I use the Request object to get the absolute URL. However, this doesn't work as expected, since the Request object has the wrong URL. The example below illustrates the problem:
package main

import (
	"fmt"
	"net/http"
	"time"

	"github.com/gocolly/colly"
)

func main() {
	go func() {
		http.Handle("/", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			http.Redirect(w, r, "/r/", http.StatusSeeOther)
		}))
		http.Handle("/r/", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintf(w, `<a href="test">test</a>`)
		}))
		http.ListenAndServe("127.0.0.1:9999", nil)
	}()
	time.Sleep(500 * time.Millisecond)

	c := colly.NewCollector()
	c.AllowedDomains = []string{"127.0.0.1:9999"}
	c.OnHTML("a[href]", func(e *colly.HTMLElement) {
		fmt.Println(e.Request.AbsoluteURL(e.Attr("href")))
	})
	c.Visit("http://127.0.0.1:9999/")
	c.Wait()
	time.Sleep(1000 * time.Hour)
}
The example prints "http://127.0.0.1:9999/test". However, when I go to "http://127.0.0.1:9999" in Firefox and click the link, I get redirected to "http://127.0.0.1:9999/r/test".
Is there a better way to mimic the browser's behavior in this case?
Are you planning to support the ability to submit forms, keep the cookies while visiting a different page, and some basic user interaction actions?
Thanks
Hey mate!
I'm loving colly so far. I'm new to the Go programming language and I've just been messing around with your scraping library and found a weird bug.
I was just testing out scraping my website, and then allowing the scraper to scrape Medium. I end up with this error:
(I'm using Go v1.9 on Linux x86.)
This is the code:
package main

import (
	"fmt"

	"github.com/asciimoo/colly"
)

func main() {
	scraper := colly.NewCollector()
	scraper.AllowedDomains = []string{"onslow.io", "medium.com"}

	scraper.OnHTML("a[href]", func(element *colly.HTMLElement) {
		link := element.Attr("href")
		// Print link
		fmt.Printf("Link found: %q -> %s\n", element.Text, link)
		// Visit link found on page
		// Only those links are visited which are in AllowedDomains
		go scraper.Visit(element.Request.AbsoluteURL(link))
	})

	scraper.OnError(func(request *colly.Response, err error) {
		fmt.Println("Request URL:", request.Request.URL, "failed with response:", request, "\nError:", err)
	})

	scraper.OnRequest(func(request *colly.Request) {
		fmt.Println("Visiting", request.URL.String())
	})

	scraper.Visit("http://onslow.io")
	scraper.Wait()
}
From what I've gathered, it has to do with the Goroutines possibly not syncing properly?
If you have any other ideas on the cause of this, it'd be great to hear them!
Cheers
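A guess at the cause, for what it's worth: launching scraper.Visit in raw goroutines bypasses colly's own synchronization. The supported way to parallelize is the collector's async mode; a minimal sketch, assuming a colly version with the Async option:

package main

import (
	"github.com/gocolly/colly"
)

func main() {
	scraper := colly.NewCollector(
		colly.AllowedDomains("onslow.io", "medium.com"),
		colly.Async(true), // colly manages the worker goroutines itself
	)
	scraper.OnHTML("a[href]", func(e *colly.HTMLElement) {
		// Plain Visit, no `go` keyword: async mode queues it safely.
		e.Request.Visit(e.Attr("href"))
	})
	scraper.Visit("http://onslow.io")
	scraper.Wait() // block until all queued requests have finished
}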
Since I need to visit an unsafe website over HTTPS, the Post/Get methods return an error: x509: certificate has expired or is not yet valid.
I know I can use InsecureSkipVerify: true when starting a request with the net/http package, but what should I do if I want to skip the SSL certificate check in colly?
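One approach that seems to work, assuming your colly version exposes WithTransport: hand the collector an *http.Transport whose TLS config skips verification. A sketch (the URL is a placeholder test site with a bad certificate):

package main

import (
	"crypto/tls"
	"net/http"

	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector()
	// Skip certificate verification; only do this for hosts you trust
	// despite the expired/invalid certificate.
	c.WithTransport(&http.Transport{
		TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
	})
	c.Visit("https://expired.badssl.com/") // placeholder URL
}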
Currently there is code in the tests that manually runs an HTTP server to test against; this is exactly what the net/http/httptest package is meant for.
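For illustration, a test built on net/http/httptest could look like this (a sketch, not the repository's actual test code):

package colly_test

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"testing"

	"github.com/gocolly/colly"
)

func TestVisit(t *testing.T) {
	// httptest.NewServer starts a real HTTP server on a random local
	// port and shuts it down when the test finishes.
	ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `<a href="/next">next</a>`)
	}))
	defer ts.Close()

	visited := false
	c := colly.NewCollector()
	c.OnResponse(func(r *colly.Response) { visited = true })
	c.Visit(ts.URL)
	if !visited {
		t.Fatal("expected the collector to visit the test server")
	}
}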
Hello,
https://github.com/asciimoo/colly/blob/7a13f4120d95f515c82f3e79b204ab96aab12156/colly.go#L123
Is this limit there for performance reasons?
package main

import (
	"time"

	"github.com/gocolly/colly"
	"github.com/gocolly/colly/debug"
)

func main() {
	urls := []string{
		"https://weibo.cn/repost/FBrYpiw8h?uid=1153760245&rl=1",
		"https://weibo.cn/repost/FBrXSqrIl?uid=2137005731&rl=1",
		"https://weibo.cn/repost/FBrXOlMmQ?uid=5131689041&rl=1",
		"https://weibo.cn/repost/FBrXJBCQs?uid=1701023441&rl=1",
		"https://weibo.cn/repost/FBrXg4ZuX?uid=5999431007&rl=1",
		"https://weibo.cn/repost/FBrXcuadg?uid=5819066338&rl=1",
		"https://weibo.cn/repost/FBrWEgEor?uid=3517902151&rl=1",
		"https://weibo.cn/repost/FBrWmuTYh?uid=2974402113&rl=1",
		"https://weibo.cn/repost/FBrVZtT1p?uid=5533885122&rl=1",
		"https://weibo.cn/repost/FBrVrqA5T?uid=1613781965&rl=1",
		"https://weibo.cn/repost/FBrYpiw8h?uid=1153760245&rl=1",
		"https://weibo.cn/repost/FBrXSqrIl?uid=2137005731&rl=1",
		"https://weibo.cn/repost/FBrXOlMmQ?uid=5131689041&rl=1",
		"https://weibo.cn/repost/FBrXJBCQs?uid=1701023441&rl=1",
		"https://weibo.cn/repost/FBrXg4ZuX?uid=5999431007&rl=1",
		"https://weibo.cn/repost/FBrXcuadg?uid=5819066338&rl=1",
		"https://weibo.cn/repost/FBrWEgEor?uid=3517902151&rl=1",
		"https://weibo.cn/repost/FBrWmuTYh?uid=2974402113&rl=1",
		"https://weibo.cn/repost/FBrVZtT1p?uid=5533885122&rl=1",
		"https://weibo.cn/repost/FBrVrqA5T?uid=1613781965&rl=1",
		"https://weibo.cn/repost/FBrUXncEG?uid=5046939400&rl=1",
	}

	// Instantiate default collector
	c := colly.NewCollector(
		// Turn on asynchronous requests
		colly.Async(true),
		// Attach a debugger to the collector
		colly.Debugger(&debug.LogDebugger{}),
	)
	c.SetRequestTimeout(2 * time.Second)

	for _, v := range urls {
		c.Visit(v)
	}
	c.Wait()
}
[Not really an issue]
Hey mate
I've been using Colly for a small scraping project and I've come across a weird bit of behaviour.
The e.ChildText() function returns the text of all children as one string. However, e.ChildAttr() only returns the first match. I read through the code in colly.go and understand this is the intended behaviour, but I was wondering why you wouldn't want to return all child attributes?
Loving this package though, it's been a lot of fun to use. Thank you for keeping it up to date!
Cheers
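For what it's worth, newer colly versions also expose ChildAttrs, which returns the attribute of every matching child; a sketch, assuming that method is available (the selector and URL are placeholders):

package main

import (
	"fmt"

	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector()
	c.OnHTML("ul", func(e *colly.HTMLElement) {
		first := e.ChildAttr("a", "href") // only the first matching child
		all := e.ChildAttrs("a", "href")  // every matching child, as a []string
		fmt.Println(first, all)
	})
	c.Visit("http://go-colly.org/") // placeholder page
}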
Any plans to add a JavaScript engine to this framework? A few projects that might help are:
https://github.com/robertkrimen/otto
https://github.com/lazytiger/go-v8
https://github.com/dop251/goja
I'm not sure if this aligns with your goals for the project, but it could be almost necessary on a lot of modern websites.
Hello,
Is SSH supported?
Thanks!
Is there a way to do request retries inside colly? Right now I can only resend the request in OnError.
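For reference, colly's Request type has a Retry method (assuming your version includes it) that re-enqueues the same request. A sketch with a simple guard so a permanently failing URL cannot retry forever:

package main

import (
	"log"

	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector()
	c.OnError(func(r *colly.Response, err error) {
		// Use the request context as a per-request retry marker.
		if r.Ctx.Get("retried") == "" {
			r.Ctx.Put("retried", "yes")
			r.Request.Retry()
			return
		}
		log.Println("giving up on", r.Request.URL, ":", err)
	})
	c.Visit("https://example.com/") // placeholder URL
}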
Implementing functional options for the NewCollector constructor function would allow users to set up the collector without manually setting field values on *Collector. This would be a non-breaking change, since the options would be a variadic argument to NewCollector. An example option could add a domain to the AllowedDomains field: NewCollector(AllowDomain("example.com")). What do you think?
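A sketch of the pattern being proposed; Collector here is a minimal stand-in just to keep the example self-contained, and CollectorOption/AllowDomain are the proposed names, not an existing API:

package main

import "fmt"

// Minimal stand-in for colly's Collector, just to show the pattern.
type Collector struct {
	AllowedDomains []string
}

// CollectorOption configures a Collector during construction.
type CollectorOption func(*Collector)

// AllowDomain appends one domain to the AllowedDomains whitelist.
func AllowDomain(domain string) CollectorOption {
	return func(c *Collector) {
		c.AllowedDomains = append(c.AllowedDomains, domain)
	}
}

// NewCollector applies options after the defaults, so existing
// zero-argument call sites keep compiling unchanged.
func NewCollector(options ...CollectorOption) *Collector {
	c := &Collector{} // defaults would be set here
	for _, o := range options {
		o(c)
	}
	return c
}

func main() {
	c := NewCollector(AllowDomain("example.com"))
	fmt.Println(c.AllowedDomains)
}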
package main

import (
	"fmt"

	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector()

	// Find and visit all links
	c.OnHTML("a[href]", func(e *colly.HTMLElement) {
		link := e.Attr("href")
		// Print link
		fmt.Printf("Link found: %q -> %s\n", e.Text, link)
	})

	c.OnRequest(func(r *colly.Request) {
		fmt.Println("Visiting", r.Headers)
	})

	c.OnResponse(func(r *colly.Response) {
		fmt.Println("Visited", r.Headers)
	})

	c.Visit("https://weibo.cn/repost/FByvKgel6?uid=6049100503&rl=1")
}
Visiting &map[User-Agent:[colly - https://github.com/gocolly/colly]]
Visited &map[Content-Type:[text/html] Connection:[keep-alive] Vary:[Accept-Encoding] Expires:[Sat, 26 Jul 1997 05:00:00 GMT] Dpool_header:[luna139] Pragma:[no-cache] Sina-Lb:[aGEuMjAyLmcxLnloZy5sYi5zaW5hbm9kZS5jb20=] Server:[nginx/1.6.1] Date:[Thu, 28 Dec 2017 13:42:51 GMT] Cache-Control:[no-cache, must-revalidate] Sina-Ts:[N2FiMjljY2UgMCAxIDEgMiA2Cg==]]
Link found: "\xb9ر\xd5" -> javascript:history.go(-1);
Link found: "" -> javascript:;
Link found: "\xbb\xbbһ\xd5\xc5" -> javascript:;
Link found: "\xb5\xc7¼" -> javascript:;
Link found: "\xb5\xda\xc8\xfd\xb7\xbd\xd5ʺ\xc5" -> https://passport.weibo.cn/signin/other?r=http%3A%2F%2Fweibo.cn
Link found: "ע\xb2\xe1\xd5ʺ\xc5" -> http://m.weibo.cn/reg/index?&vt=4&wm=3349&wentry=&backURL=http%3A%2F%2Fweibo.cn
Link found: "\xcd\xfc\xbc\xc7\xc3\xdc\xc2\xeb" -> https://passport.weibo.cn/forgot/forgot?entry=wapsso&from=0
Link found: "ȡ\xcf\xfb" -> javascript:;
Link found: "\xd1\xe9֤\xc2\xeb\xb5\xc7¼" -> javascript:;
Link found: "\xb9ر\xd5" -> javascript:history.go(-1);
Link found: "ȷ\xc8\xcf" -> javascript:;
Link found: "ʹ\xd3\xc3\xc6\xe4\xcb\xfb\xd5ʺŵ\xc7¼" -> javascript:;
package main

import (
	"github.com/asciimoo/colly"
)

func main() {
	c := colly.NewCollector()
	c.Visit("https://www.google.com")
}
If built for the 386 architecture, it crashes:
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x0 pc=0x4012bc]
goroutine 1 [running]:
sync/atomic.AddUint64(0x1284e304, 0x1, 0x0, 0x128c74c0, 0x0)
d:/soft/go/src/sync/atomic/asm_386.s:112 +0xc
github.com/asciimoo/colly.(*Collector).scrape(0x1284e280, 0x73acb0, 0x16, 0x72f0c5, 0x3, 0x1, 0x0, 0x0, 0x128212f8, 0x0, ...)
d:/go/src/github.com/asciimoo/colly/colly.go:244 +0x244
github.com/asciimoo/colly.(*Collector).Visit(0x1284e280, 0x73acb0, 0x16, 0x69320b, 0x693524)
d:/go/src/github.com/asciimoo/colly/colly.go:175 +0x6b
main.main()
d:/go/src/playground/main.go:9 +0x31
From https://golang.org/pkg/sync/atomic/
On both ARM and x86-32, it is the caller's responsibility to arrange for 64-bit alignment of 64-bit words accessed atomically. The first word in a variable or in an allocated struct, array, or slice can be relied upon to be 64-bit aligned.
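The usual fix for this class of crash is to make the atomically updated uint64 the first field of its struct, so the quoted alignment guarantee applies. A sketch of the pattern (the field names are illustrative, not colly's actual layout):

package main

import (
	"fmt"
	"sync/atomic"
)

type collector struct {
	// requestCount is updated with sync/atomic, so it is kept as the
	// first field: on 386 and ARM only the first word of an allocated
	// struct is guaranteed to be 64-bit aligned.
	requestCount uint64
	name         string
}

func main() {
	c := &collector{name: "demo"}
	fmt.Println(atomic.AddUint64(&c.requestCount, 1)) // prints 1
}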
package main

import (
	"fmt"
	"time"

	"github.com/gocolly/colly"
	"github.com/gocolly/colly/debug"
)

func main() {
	c := colly.NewCollector(
		colly.UserAgent("Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36"),
		colly.AllowedDomains("irby.kz"),
		colly.Async(true),
		colly.Debugger(&debug.LogDebugger{}),
	)
	c.DisableCookies()
	c.Limit(&colly.LimitRule{
		Parallelism: 2,
		Delay:       1 * time.Second,
	})

	c.OnHTML("a[href]", func(e *colly.HTMLElement) {
		link := e.Attr("href")
		fmt.Printf("Link found: %q -> %s\n", e.Text, link)
		c.Visit(e.Request.AbsoluteURL(link))
	})

	// Set error handler
	c.OnError(func(r *colly.Response, err error) {
		fmt.Println("Request URL:", r.Request.URL, "failed with response:", r, "\nError:", err)
	})

	// Set HTML callback
	// Won't be called if error occurs
	c.OnHTML("*", func(e *colly.HTMLElement) {
		fmt.Println(e)
	})

	c.OnRequest(func(r *colly.Request) {
		fmt.Println("Visiting", r.URL.String())
	})

	c.Visit("http://irby.kz/ru/catalog/dlya_devochek/?SHOW_ALL=Y")
	c.Visit("http://irby.kz/ru/catalog/dlya_malchikov_1/?SHOW_ALL=Y")
	c.Wait()
}
Do I have to initialize a new collector? It doesn't work after this.
Do you have any plans to support multipart POST requests?
I will code this feature later.
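Until then, a workaround sketch: build the multipart body with the standard library's mime/multipart and send it through PostRaw (assuming your colly version has it), setting the Content-Type header in OnRequest. The upload URL and form field are placeholders:

package main

import (
	"bytes"
	"log"
	"mime/multipart"

	"github.com/gocolly/colly"
)

func main() {
	// Build the multipart body by hand.
	buf := &bytes.Buffer{}
	w := multipart.NewWriter(buf)
	w.WriteField("comment", "hello")
	contentType := w.FormDataContentType() // includes the boundary
	w.Close()

	c := colly.NewCollector()
	c.OnRequest(func(r *colly.Request) {
		r.Headers.Set("Content-Type", contentType)
	})
	if err := c.PostRaw("https://example.com/upload", buf.Bytes()); err != nil {
		log.Fatal(err)
	}
}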
Currently you can specify a URLFilter to include URLs; is there any way to exclude URLs?
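Not directly, as far as I can tell, but one workaround is to match the URL against an exclusion pattern in OnRequest and abort (a sketch; the pattern and URL are placeholders):

package main

import (
	"regexp"

	"github.com/gocolly/colly"
)

func main() {
	exclude := regexp.MustCompile(`/(login|logout|admin)/`)

	c := colly.NewCollector()
	c.OnRequest(func(r *colly.Request) {
		if exclude.MatchString(r.URL.String()) {
			r.Abort() // drop the request before it is sent
		}
	})
	c.Visit("https://example.com/") // placeholder URL
}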
Does it make sense to support abiding by the website's robots.txt?
Currently the visited-URL check uses only the URL to detect duplicate requests. The collector should accept a custom request-uniqueness function (e.g. func(r *Request) string), because POST requests may be distinguished by an ID in the form data, not just by the URL.
BTW, if you accept PRs, I can contribute some work.
I want to put a storage object in a context, but the context only accepts strings.
The purpose is to filter out mismatched URLs, for example:
package main

import (
	"fmt"
	"time"

	"github.com/asciimoo/colly"
)

func main() {
	urls := []string{"https://httpbin.org/hello", "https://httpbin.org/123", "https://httpbin.org/12xyz"}

	// Instantiate default collector
	c := colly.NewCollector()

	// when visiting links which domains' matches "*httpbin.*" glob
	c.Limit(&colly.LimitRule{
		DomainGlob: "*https://httpbin.org/[a-z]+",
	})

	c.OnResponse(func(r *colly.Response) {
		fmt.Println("Finished", r.Request.URL, time.Now())
	})

	for _, i := range urls {
		c.Visit(i)
	}
	c.Wait()
}
Starting https://httpbin.org/hello
Starting https://httpbin.org/123
Starting https://httpbin.org/12xyz
Finished https://httpbin.org/hello
For batch crawls, memory usage grows very fast and colly becomes very slow.
I'm doing some timing during a crawl: I record the start time in Ctx in OnRequest and calculate a duration in OnResponse. It all works very well until I try to throttle the crawler with a Limit rule with a Delay, since the sleep is called by the backend after the OnRequest callbacks.
Is there another callback that can be used, or would you consider moving the sleep to before OnRequest is called?
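For context, the timing setup in question looks roughly like this (a sketch; the start time is stored as a string because Ctx historically held only strings):

package main

import (
	"log"
	"time"

	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector()
	c.OnRequest(func(r *colly.Request) {
		r.Ctx.Put("start", time.Now().Format(time.RFC3339Nano))
	})
	c.OnResponse(func(r *colly.Response) {
		start, err := time.Parse(time.RFC3339Nano, r.Ctx.Get("start"))
		if err == nil {
			// With a Limit Delay, this duration includes the sleep,
			// which is exactly the problem described above.
			log.Println(r.Request.URL, "took", time.Since(start))
		}
	})
	c.Visit("http://go-colly.org/")
}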
Is it possible to create randomized delays, i.e., per-request delays selected from some range or based on some random factor? I couldn't think of a good way to do this, other than maybe cycling through several Collectors with different limit sets, which seems sub-optimal.
It seems like having LimitRule.DelayRange or LimitRule.RandomFactor options would be quite helpful.
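For what it's worth, current colly versions have a RandomDelay field on LimitRule that adds up to the given extra duration per request; if your version includes it, the setup is short (a sketch):

package main

import (
	"time"

	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector(colly.Async(true))
	c.Limit(&colly.LimitRule{
		DomainGlob:  "*",
		Delay:       2 * time.Second, // fixed base delay between requests
		RandomDelay: 1 * time.Second, // plus up to 1s of random jitter
	})
	c.Visit("http://go-colly.org/")
	c.Wait()
}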
I found
rp, err := proxy.RoundRobinProxySwitcher("socks5://127.0.0.1:1337", "socks5://127.0.0.1:1338")
if err != nil {
	log.Fatal(err)
}
c.SetProxyFunc(rp)
but if I have ten URLs to request, this way every request goes through a proxy. I only want one request to use the proxy; what should I do?
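SetProxyFunc accepts any func(*http.Request) (*url.URL, error), so one option is to write your own switcher that returns the proxy only for the requests you care about and nil (meaning a direct connection) for the rest. A sketch with a placeholder host and proxy address:

package main

import (
	"net/http"
	"net/url"

	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector()
	c.SetProxyFunc(func(req *http.Request) (*url.URL, error) {
		// Only this one host goes through the proxy.
		if req.URL.Host == "example.com" {
			return url.Parse("socks5://127.0.0.1:1337")
		}
		return nil, nil // direct connection for everything else
	})
	c.Visit("https://example.com/")
	c.Visit("https://go-colly.org/")
}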
When I use colly, I have a case where I need to iterate over the context's elements after putting things into it from multiple OnHTML callbacks on different HTML elements.
This is the simple function I wrote:
// ForEach iterates over the context
func (c *Context) ForEach(fn func(k string, v interface{}) interface{}) []interface{} {
	c.lock.RLock()
	defer c.lock.RUnlock()
	ret := make([]interface{}, 0, len(c.contextMap))
	for k, v := range c.contextMap {
		cur := fn(k, v)
		ret = append(ret, cur)
	}
	return ret
}
Hope this helps someone else who needs to iterate over the context.
Some sites have many subdomains, so it would be convenient to set AllowedDomains with a pattern, like this:
// c.AllowedDomains = []string{"hackerspaces.org", "wiki.hackerspaces.org"}
c.AllowedDomains = []string{"*hackerspaces.org"}
There are five main callbacks in colly: OnRequest, OnError, OnResponse, OnHTML, and OnScraped.
We want to show readers these callbacks at the very beginning, so how about extending the basic example with all of them?
Any example? Thanks! :-)
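Something like this, perhaps: the basic example extended with all five callbacks, in roughly the order they fire (a sketch):

package main

import (
	"fmt"

	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector()

	// 1. Called before a request is made.
	c.OnRequest(func(r *colly.Request) {
		fmt.Println("Visiting", r.URL)
	})
	// 2. Called if an error occurs during the request.
	c.OnError(func(r *colly.Response, err error) {
		fmt.Println("Error:", err)
	})
	// 3. Called after a response is received.
	c.OnResponse(func(r *colly.Response) {
		fmt.Println("Got", r.StatusCode)
	})
	// 4. Called for each matched HTML element.
	c.OnHTML("a[href]", func(e *colly.HTMLElement) {
		fmt.Println("Link:", e.Attr("href"))
	})
	// 5. Called after OnHTML, once scraping of the page has finished.
	c.OnScraped(func(r *colly.Response) {
		fmt.Println("Finished", r.Request.URL)
	})

	c.Visit("http://go-colly.org/")
}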
I was wondering how it's possible to pass context between collectors.
My use case: I have a collector that collects links and then triggers another collector to visit each link. On the second collector, I'd like to pass some sort of context (like a parent category name that is not present in the child page's HTML), but I didn't find a way to achieve this, because Context seems to exist only for requests/responses within the same collector.
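One way to do it, assuming the lower-level Request method is available (its arguments are method, URL, body, ctx, headers): fill a fresh Context on the first collector and hand it to the second collector's request. A sketch with placeholder selectors and URL:

package main

import (
	"fmt"

	"github.com/gocolly/colly"
)

func main() {
	parent := colly.NewCollector()
	child := colly.NewCollector()

	parent.OnHTML("a.category", func(e *colly.HTMLElement) {
		ctx := colly.NewContext()
		ctx.Put("category", e.Text) // carry the parent category over
		child.Request("GET", e.Request.AbsoluteURL(e.Attr("href")), nil, ctx, nil)
	})

	child.OnHTML("h1", func(e *colly.HTMLElement) {
		fmt.Println("category:", e.Request.Ctx.Get("category"), "title:", e.Text)
	})

	parent.Visit("https://example.com/") // placeholder URL
}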
Currently, you can put anything in the context, which was allowed after the fix for this issue. But it will still type-assert any value to a string on retrieval.
I propose allowing retrieval of the value as an interface instead, which would be a breaking change.
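The proposed accessor could be as small as this. Context here is a minimal stand-in mirroring the lock/contextMap fields seen in the ForEach helper earlier; GetAny is a suggested name, not an existing API:

package main

import (
	"fmt"
	"sync"
)

// Minimal stand-in for colly's Context.
type Context struct {
	contextMap map[string]interface{}
	lock       *sync.RWMutex
}

// GetAny returns the raw value stored under key, leaving any type
// assertion to the caller; nil means the key is absent.
func (c *Context) GetAny(key string) interface{} {
	c.lock.RLock()
	defer c.lock.RUnlock()
	if v, ok := c.contextMap[key]; ok {
		return v
	}
	return nil
}

func main() {
	ctx := &Context{contextMap: map[string]interface{}{"n": 42}, lock: &sync.RWMutex{}}
	fmt.Println(ctx.GetAny("n")) // 42, as an interface{} the caller can assert
}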
Great project here! I would love to be able to pass a custom http.Transport for the backend's http.Client to use.
Why?:
Will be submitting a PR shortly
Hi,
It seems that sync.WaitGroup is not used correctly in the func (c *Collector) scrape(...) error method.
// colly.go:307
func (c *Collector) scrape(...) error {
	c.wg.Add(1)
	defer c.wg.Done()
	...
Consider the following example (similar to http://go-colly.org/docs/examples/rate_limit/):
for _, url := range urls {
	go c.Visit(url)
}
c.Wait()
Here we call Visit (a scrape wrapper) in a separate goroutine for each URL and wait for their completion.
The problem is that it is not guaranteed that all goroutines have finished after c.Wait() returns, since the "goroutines to wait for" count is incremented inside each new goroutine by c.wg.Add(1). So it's up to the scheduler whether c.wg.Add(1) is called before or after c.Wait().
I think there are two ways this issue could be fixed:
1. Get rid of the internal sync.WaitGroup and just provide a pointer to one as a parameter.
2. Add async versions of scrape and its wrappers.
The second option could look like this:
func (c *Collector) scrapeAsync(...) chan<- error {
	errChan := make(chan error)
	c.wg.Add(1)
	go func() {
		defer c.wg.Done()
		errChan <- c.scrape(...)
	}()
	return errChan
}
I can do a PR if you want.
On Mac with go version 1.9
Fils:collyIndexer dfils$ go get github.com/asciimoo/colly
# github.com/asciimoo/colly
../../../../github.com/asciimoo/colly/colly.go:302:16: cannot use n.Attr (type []"code.google.com/p/go.net/html".Attribute) as type []"golang.org/x/net/html".Attribute in field value
Fils:collyIndexer dfils$ go version
go version go1.9 darwin/amd64
What are your thoughts on adding proxy support to the Collector? I see that one could just create a custom Transport and set the collector to use it, but it would be nice to have a SetProxy(url) method or something similar.
Hello @asciimoo, good job with the lib!
I just wanted to say that I opened a PR 5 days ago to add your project to the "awesome-go" directory.
The README doesn't contain any quality references, like a report card, so I took the liberty of including them in the PR text. Please add them as badges to the README.md of this repository as well; the links are:
However, in order for the PR to be acceptable, we have to complete some other links as well, like the coverage service link, which looks like this: https://cover.run/go/github.com/asciimoo/colly.svg
But I couldn't find any tests (_test.go files) in "colly", so the PR was marked as "pending".
Please keep watching this and do your best to complete some test files; those details matter there.
Thank you!