Improve the naming of things? about fastai HOT 15 CLOSED

fastai avatar fastai commented on April 27, 2024 16
Improve the naming of things?

from fastai.

Comments (15)

FabianHertwig avatar FabianHertwig commented on April 27, 2024 4

There is a whole chapter about naming things in the Clean Code book by Robert Martin. In my experience most programmers follow the ideas in that book. So I think we should get rid of all abbreviations. Even lr should be named learning_rate. If it is important that the most incorrect are selected by recall then plot_most_confident_incorrect_by_recall could be the perfect name. It tells you everything you need to know. Otherwise one could name it plot_incorrect and hide the details of how the function works.

And even if you do not have any autocompletion, it is important to use good names. Code is read 100 to 1000 times more than it is written, even by the programmer who wrote the code. This library is starred over 2000 times, so I guess some lines of code are read way over 2000 times. If you invest a little time in writing good names, you can save a lot of time that is spent reading the code.

The question is, how can we refactor the library to use better names? Should we break the API and just use other names at one point in time?

AdityaSoni19031997 avatar AdityaSoni19031997 commented on April 27, 2024 3

We can create a wiki and list all the shorthands?

hollance avatar hollance commented on April 27, 2024 2

@rohitgeo Agreed on balance. For example, I use lr to mean the learning rate since that is a common abbreviation used in many machine learning APIs. So I wouldn't advocate for writing learning_rate there, that would be taking it too far.

simonm3 avatar simonm3 commented on April 27, 2024 2

I think the short names are better. This is designed for an interactive environment not for an application package. As such the best practices are different.

Interactively you might use learning rate or batch size 100 times in a project as you try multiple alternatives. It is not comparable to an application where you have hundreds of variables used infrequently so longer names make more sense. And these are so core that you cannot seriously forget what lr or bs means unless you are a complete beginner - and you won't be that for long. Personally I am relieved not to have to write ImageDataGenerator().flow_from_directory() multiple times in a notebook.

rohitgeo avatar rohitgeo commented on April 27, 2024 1

I think there's a balance between short and long names. sz and lrs are at one end and plot_most_confident_incorrect_by_recall is at the other end - both can be avoided. Thanks to intellisense and auto-completion in the notebook, we don't have to type out everything, so there's not so much motivation to abbreviate everything.

rohitgeo avatar rohitgeo commented on April 27, 2024 1

The goal should be to make the API pythonic and pandorable, not blindly follow any book. PEP8 starts out by saying "A Foolish Consistency is the Hobgoblin of Little Minds." There are abbreviations used in Python and Pandas - so we don't need to be rigid and say "No abbreviations. Period." Such refactorings need to be decided on a case-by-case basis by the repo owner - naming of things is not something a committee could ever agree upon.

Coming to the more practical aspect, of refactoring the library vs breaking it, the good thing is that the library is still pre-alpha. It's a good time to make such changes.

In cases where backward compatibility needs to be maintained, a catch all kwargs parameter can be added to methods where parameters are renamed so it continues working with the older names.
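A minimal sketch of that catch-all kwargs idea, assuming a hypothetical `fit` function whose `lr` parameter was renamed to `learning_rate`. Neither the function nor its signature is taken from the fastai code base; they only illustrate the pattern.

```python
import warnings


def fit(learning_rate=None, **kwargs):
    """Hypothetical function whose `lr` parameter was renamed to `learning_rate`."""
    # Accept the legacy name through the catch-all kwargs.
    if "lr" in kwargs:
        if learning_rate is not None:
            raise TypeError("pass either 'learning_rate' or the legacy 'lr', not both")
        warnings.warn(
            "'lr' is deprecated; use 'learning_rate'",
            DeprecationWarning,
            stacklevel=2,
        )
        learning_rate = kwargs.pop("lr")
    # Anything still left in kwargs is a genuine mistake, so fail loudly
    # instead of silently swallowing a typo.
    if kwargs:
        raise TypeError(f"unexpected keyword arguments: {sorted(kwargs)}")
    return learning_rate
```

Old calls such as `fit(lr=0.01)` keep working and emit a deprecation warning, while `fit(learning_rate=0.01)` is the new spelling. One trade-off of a catch-all parameter is that misspelled arguments are no longer rejected by Python itself, which is why the sketch checks for leftovers explicitly.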

rvarbanov avatar rvarbanov commented on April 27, 2024 1

I can't agree more with @FabianHertwig

To add to his comment, the best API is the one that is well adopted. The best way to achieve that is to make it simple for new adopters to get up and running. The steeper the learning curve, the worse the adoption rate you are going to have. One of the best ways to flatten the learning curve is to make the API self-explanatory.

The only benefit of abbreviating is having to type fewer characters when writing code. I do not know a programmer who would sacrifice ease of learning a new API for having to type less.

My understanding is that this API is for developers. If that is still true, please make it easy for developers to learn.

Every API requires some learning, but the best ones are those that take the least time to learn.

jph00 avatar jph00 commented on April 27, 2024 1

This is discussed at various points in the course. I have a strong preference for more to fit in the amount of screen space my eye can see at once. I've found that the approaches that work best for me for data science are not the same as those that work best for general software engineering. Unfortunately, few people have written about effective patterns for data science code - although there are a lot of examples in the APL/J/K world of a more extreme version of what I do.

Anyhoo - the naming is very much intentional and based on a couple of decades of both software engineering and data science experience, so I'll be sticking with it. Every variable name is either a mnemonic (lr->learning rate), or is based on standards from the ML and stats literature (x->independent variables; y->dependent variables). They, hopefully, are consistent throughout the code base. I'm happy to take PRs for any examples of inconsistent naming! :)

Many thanks for the discussion.

workflow avatar workflow commented on April 27, 2024 1

What if we take the wiki and turn it into aliases for the shorthand function calls?

So for example, we would have batchnorm_freeze(), which internally simply calls bn_freeze().
The same with longhand versions of shorthand parameters in the method signatures.

That way, @jph00 can keep writing and using the shorthand versions, while there is also a more verbose interface available.
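A rough sketch of the alias idea, using a hypothetical stand-in `Learner` class rather than the real fastai one. The `do_freeze` parameter and the `bn_frozen` attribute are assumptions made for illustration only.

```python
class Learner:
    """Hypothetical stand-in; not the real fastai Learner."""

    def __init__(self):
        self.bn_frozen = False

    def bn_freeze(self, do_freeze=True):
        # Shorthand method that holds the actual behaviour.
        self.bn_frozen = do_freeze

    def batchnorm_freeze(self, do_freeze=True):
        # Longhand alias: forwards to the shorthand version, so a subclass
        # that overrides bn_freeze is picked up here as well.
        return self.bn_freeze(do_freeze)


learner = Learner()
learner.batchnorm_freeze(True)  # same effect as learner.bn_freeze(True)
```

Forwarding through `self.bn_freeze` rather than writing `batchnorm_freeze = bn_freeze` in the class body keeps the two names in sync under subclassing, at the cost of one extra call.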

imbolc avatar imbolc commented on April 27, 2024 1

@jph00

I have a strong preference for more to fit in the amount of screen space my eye can see at once.

Sure, but in case of notebooks:

  • most rows there contain a single instruction, with a lot of free horizontal space
  • valuable vertical space is mostly spent on output, markdown descriptions and comments (with clear variable names you can even omit some of the latter)

I've found that the approaches that work best for me

  • look at programming languages, for example: there are a lot of different style conventions. I think it's safe to say that clear variable names are good for everyone, but abbreviations may be just a matter of taste
  • you're writing notebooks for, presumably, newbies in ML who don't know a lot of the domain language, so you're making their learning curve steeper

ChrisPalmerNZ avatar ChrisPalmerNZ commented on April 27, 2024

Hmmm, that was interesting to think about - it's true that I also wondered about some of the abbreviations, but only once or twice, and then I forgot about it. It does help, however, having short names when you are putting statements together on one line... as Jeremy does.

After thinking about them now, I believe that I at least can live with them :) Jeremy's bs is good bs!

bhollan avatar bhollan commented on April 27, 2024

Yes! There's no reason to abbreviate things!

Shameless self-blog-promotion.

apiltamang avatar apiltamang commented on April 27, 2024

I second the above comment. Each and every time I've encountered the acronym 'bs', I read it the way so many people are used to saying it in their daily lives (cue: and that's not batch_size). Can't stop that wisp of weird feeling that follows shortly!

One way that helps is to immediately write documentation for the method. I sought to write a few as I took the class, but there are certainly places where it could improve.

ChrisPalmerNZ avatar ChrisPalmerNZ commented on April 27, 2024

Yeah, that is the point I was trying to make with the observation that "Jeremy's bs is good bs". These abbreviations are used so often, and often in concise code of multiple components, it seems to me to be a no-brainer to see how useful it is to use an abbreviation. To look at, to type, and to get familiar with.

Besides, Jeremy clearly warned us at the beginning that his approach might look quite different than accepted practise!

jph00 avatar jph00 commented on April 27, 2024

@reshamas has done that already :) https://github.com/reshamas/fastai_deeplearn_part1/blob/master/fastai_dl_terms.md
