bramblexu / pydata-notebook

Python for Data Analysis, 2nd Edition (2017): Chinese translation notes

Python 0.03% Jupyter Notebook 99.97%
python-for-data-analysis jupyter-notebook chinese-translation data-analysis pandas

pydata-notebook's Introduction

Chinese translation notes for Python for Data Analysis, 2nd Edition (2017)

The GitHub repository for the English edition of this book: pydata-book

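The author, Wes McKinney, is the creator of pandas, so the book's coverage of pandas is also its most practical part. I contacted Wes directly, so these notes will not raise any copyright issues, and of course they will not be used for any commercial purpose.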

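The book has been well received ever since the first edition came out in 2013. The first edition used Python 2, but with Python 2's end of maintenance approaching (2020) and Python 3 gaining adoption, the community's move to Python 3 has become irreversible. The second edition therefore uses Python 3.6. The code I actually wrote is based on Python 3.5, and I found no difference in use.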

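Main updates in the 2017 second edition:

  1. All code, including the Python tutorial, has been upgraded to Python 3.6 (the first edition used Python 2.7)
  2. The Python installation instructions have been updated and now use the Anaconda Python distribution, along with a few other required Python packages
  3. Uses the latest 2017 version of pandas
  4. A new chapter covers advanced pandas tools and other useful tips
  5. A brief introduction to using statsmodels and scikit-learn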

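I had known about this book for a long time, but only recently found the time to work through it in full and, along the way, turn it into Jupyter notes for later reference. While reading Chapter 3 of the first edition, I discovered that the author had released a second edition in 2017, with no Chinese edition available yet. Since I was going to make notes anyway, I decided to translate the English directly into Chinese and share a concise Notebook version, as a small contribution to the open source world.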

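In these notes I write in Chinese wherever possible, but for some technical terms I write the English directly and add a Chinese translation. Knowing the English word makes it easier to search English documentation, and I believe this helps readers map Chinese technical terms to their English counterparts. In a programmer's world, not knowing English makes things hard; even simple words can be our starting point toward a new world.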

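I also do not translate word for word, because that produces stilted, foreign-sounding text that is hard to understand. I translate in whatever way is easiest to follow, and I skip remarks that add nothing. The translation has not been carefully vetted, and the content includes many of my own explanations. I recommend buying the official Chinese edition once it is released; its translation quality will certainly be more reliable than mine. Because one person is translating the whole book, the workload is considerable, so errors, omissions, and passages that read oddly are inevitable. If you find any, please speak up; all improvements and Pull Requests are welcome.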

Disclaimer

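I did this translation purely out of personal interest, without authorization from a publisher in China. As some friends pointed out, the original author overseas does not hold the translation rights, so even with the author's permission I may not translate the book on my own. Out of respect for copyright and for the work of the Chinese translators, these notes keep only part of the translated content. I kept some relatively basic chapters, which can be used to get to know NumPy and pandas. Readers who want the complete content can look forward to the forthcoming Chinese edition.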

commit 94ab376: this is as far as I can help.

License

The code in this repository, including all code samples in the notebooks listed above, is released under the MIT license. Read more at the Open Source Initiative.

pydata-notebook's People

Contributors

bigpear0201, bramblexu, jokermonn, kenshintang, rookieday, supermaxwu

pydata-notebook's Issues

Renaming files to make git clone work on Windows

jvns/pandas-cookbook#25

On Windows, git clone fails because file names cannot contain question marks.

error: unable to create file Chapter-01/1.1 What Is This Book About?(这本书是关于什么的).ipynb: Invalid argument
error: unable to create file Chapter-01/1.2 Why Python for Data Analysis?(为什么使用Python做数据分析).ipynb: Invalid argument
Your branch is up-to-date with 'origin/master'.
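
As a rough illustration of the renaming this issue asks for, the sketch below strips the characters that Windows rejects in file names. The helper name `windows_safe_name` is hypothetical and not part of the repository; the renaming itself would still have to be committed from a non-Windows checkout (for example with `git mv`).

```python
import re

# Characters that Windows does not allow in file names.
WINDOWS_RESERVED = r'[<>:"/\\|?*]'


def windows_safe_name(name: str, replacement: str = "") -> str:
    """Return a file name (not a full path) with reserved characters replaced.

    Hypothetical helper for illustration only.
    """
    return re.sub(WINDOWS_RESERVED, replacement, name)


old_name = "1.1 What Is This Book About?(这本书是关于什么的).ipynb"
new_name = windows_safe_name(old_name)
print(new_name)
```

The function works on a single file name rather than a path, because `/` and `\` are themselves in the reserved set.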

The code on page 132 of the book, i.e. section 5.1.2 In[69], raises an error under Python 3.6 and 3.7

>>> import pandas as pd
>>> import numpy as np
>>> pop={'Nevada':{2001:2.4,2002:2.9},'Ohio':{2000:1.5,2001:1.7,2002:3.6}}
>>> pd.DataFrame(pop,index=[2001,2002,2003])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/frame.py", line 348, in __init__
    mgr = self._init_dict(data, index, columns, dtype=dtype)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/frame.py", line 459, in _init_dict
    return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/frame.py", line 7359, in _arrays_to_mgr
    arrays = _homogenize(arrays, index, dtype)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/frame.py", line 7661, in _homogenize
    oindex = index.astype('O')
AttributeError: 'list' object has no attribute 'astype'

The error is shown above. With the latest pandas, 0.23.4, it fails under both Python 3.6 and 3.7.
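
A possible workaround, offered as a sketch rather than as the book's code or a fix confirmed in this thread: build the DataFrame from the nested dict first and then align the rows with `reindex`, or pass a real `pd.Index` instead of a plain list. The traceback shows the failure at `index.astype('O')` on a list, so both variants avoid handing a list to that code path; whether they behave identically on pandas 0.23.4 is an assumption.

```python
import pandas as pd

pop = {'Nevada': {2001: 2.4, 2002: 2.9},
       'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}

# Variant 1: construct without an explicit index, then select and align
# the wanted row labels. Labels missing from the data (2003) become NaN.
frame = pd.DataFrame(pop).reindex([2001, 2002, 2003])

# Variant 2: wrap the labels in a pandas Index, which has an astype method.
frame_alt = pd.DataFrame(pop, index=pd.Index([2001, 2002, 2003]))

print(frame)
```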

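A friendly reminder

Hello, your open source spirit is commendable, but I want to give you a friendly reminder about copyright. The translation rights are not held by the original author; they were bought outright by a publisher in China. If you translate without a translation contract from the publisher, you may end up in a copyright dispute.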

Chapter 8: translation of join

In Chapter 8, join corresponds to the join operation in relational databases, so it should be translated as "连接".

7.2 Discretization intervals

The notes read "18 ~ 25, 26 ~ 35, 36 ~ 60, >60"; this should be "19 ~ 25, 26 ~ 35, 36 ~ 60, >60".
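
The correction can be checked with `pd.cut`, whose intervals are closed on the right by default. The sketch below assumes the section uses the bin edges `[18, 25, 35, 60, 100]` (the issue does not state them); the sample ages are made up to probe the boundaries.

```python
import pandas as pd

# Assumed bin edges; with the default right=True the intervals are
# (18, 25], (25, 35], (35, 60], (60, 100].
bins = [18, 25, 35, 60, 100]

# Hypothetical ages chosen to sit on and next to each edge.
ages = [18, 19, 25, 26, 60, 61]

cats = pd.cut(ages, bins)

# codes gives the bin number for each age; -1 means "outside every bin".
print(list(cats.codes))  # → [-1, 0, 0, 1, 2, 3]
```

An age of exactly 18 falls outside the first interval, so for whole-number ages the first group is 19 to 25. Passing `right=False` would instead make the intervals closed on the left.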
