Comments (3)

asishm commented on June 15, 2024

"what's even weirder with to_dict is that pd.NA gets converted to None, whereas np.nan and None are both converted to nan"

None doesn't get converted to nan in the to_json step. Rather, it happens in the DataFrame constructor. If you craft a DataFrame that explicitly has None (for example, by passing dtype='object'), you'll see that to_dict does return None.

In [37]: pd.DataFrame({'a': [1, None, pd.NaT, np.nan, pd.NA]}, dtype='object').to_dict(orient='records')
Out[37]: [{'a': 1}, {'a': None}, {'a': NaT}, {'a': nan}, {'a': None}]
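
To make the constructor's role concrete, here is a minimal sketch comparing default dtype inference with dtype='object'. The variable names are illustrative and not from the thread.

```python
import math

import pandas as pd

# Default inference: the column mixes an int and None, so pandas
# stores it as float64 and the None becomes NaN at construction time.
inferred = pd.DataFrame({'a': [1, None]})

# Forcing object dtype keeps the original Python objects, None included.
as_object = pd.DataFrame({'a': [1, None]}, dtype='object')

print(inferred['a'].dtype)   # float64
print(as_object['a'].dtype)  # object

inferred_records = inferred.to_dict(orient='records')
object_records = as_object.to_dict(orient='records')

# to_dict only reports what the constructor already stored.
print(math.isnan(inferred_records[1]['a']))  # True
print(object_records[1]['a'] is None)        # True
```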

Aloqeely commented on June 15, 2024

I agree things should be consistent. What's even weirder with to_dict is that pd.NA gets converted to None, whereas np.nan and None are both converted to nan.

rminkler1 commented on June 15, 2024

I'm seeing a lot of inconsistencies here, and they don't all appear to be pandas related.
Is this a bug, or is it intentional for some reason?
I suspect it is intentional: because data[0]['b'] is numeric, the None in data[1]['b'] is treated as a missing numeric value, i.e. NaN.
With all the JSON conversions as well, I wouldn't expect NaN, null, or None to be maintained.
While it would be ideal for them to match, I'm beginning to think that won't be possible.

Test:

import pandas as pd
import json

data = [{'a': 1, 'b': 1},  {'a': 2, 'b': None}]
df = pd.DataFrame(data)
print("input data")
print(data)
print("\nafter conversion to dataframe")
print(df)
print("\nafter conversion to_dict")
print(df.to_dict(orient='records'))
print("\nafter conversion to_json")
print(df.to_json(orient='records'))
print("\nafter python json processing")
print(json.loads(df.to_json(orient='records')))

output:

input data
[{'a': 1, 'b': 1}, {'a': 2, 'b': None}]

after conversion to dataframe
   a    b
0  1  1.0
1  2  NaN

after conversion to_dict
[{'a': 1, 'b': 1.0}, {'a': 2, 'b': nan}]

after conversion to_json
[{"a":1,"b":1.0},{"a":2,"b":null}]

after python json processing
[{'a': 1, 'b': 1.0}, {'a': 2, 'b': None}]

Maybe NaN in pandas should convert to None with to_dict, instead of nan?
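
Until something like that exists, one possible workaround is to post-process the to_dict records and swap every missing value for None. This is a sketch, not pandas' intended behavior: the helper name records_with_none is hypothetical, it assumes every cell holds a scalar, and it also turns NaT and pd.NA into None.

```python
import json

import pandas as pd


def records_with_none(df):
    """Return to_dict(orient='records') output with missing values as None.

    Hypothetical helper: pd.isna is applied to each cell value, so it
    assumes cells are scalars (not lists or arrays).
    """
    return [
        {key: (None if pd.isna(value) else value) for key, value in row.items()}
        for row in df.to_dict(orient='records')
    ]


data = [{'a': 1, 'b': 1}, {'a': 2, 'b': None}]
df = pd.DataFrame(data)

cleaned = records_with_none(df)
print(cleaned)

# With this test frame, the result matches the JSON round trip shown above.
print(cleaned == json.loads(df.to_json(orient='records')))
```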
