Comments (41)

zooba commented on May 18, 2024

I understand the arguments that everyone (that is, all the other package managers) could fix this themselves, but this is definitely in scope for what we want to be working on. We would rather fix as much as possible in the OS so that the benefits are "free", and then look at ways to help individual applications improve their own performance.

For example, we've been tracking Python and Git performance for a while now, and have already (quietly) released some updates that reduce overheads, which will naturally improve Yarn as well. We've also fed back some suggestions to the projects to help them improve things.

But we do believe that to a certain extent, POSIX code should be easy to port to Windows and maintain its performance. And since we're the only ones who can improve Windows itself, we're looking for ways to improve that "certain extent" before we turn back to giving advice directly to projects.

asklar commented on May 18, 2024

Thanks for your input @Qix-! To answer your question as to why this is a problem for Microsoft: it negatively affects the developer experience of a Microsoft product (one that I work on 🙂), React Native for Windows. Folks coming from a JS background, or from React Native on Android/iOS, are often disappointed with how slow the tools they have to use are when they want to port their app to Windows, so we want to make it better for people writing apps for Windows :)
Thanks!

bitcrazed commented on May 18, 2024

@Stanzilla - it's in everyone's best interest that Microsoft enables all applications to run as fast as they can on Windows, without having to suffer unexpected perf issues, hence this repo and the program of work backing it ;)

Qix- commented on May 18, 2024

I'll be the first person to point out the deficiencies of the Windows platform to anyone who asks, but I really don't see how this is Microsoft's problem.

Further, before I fully respond, I'm a bit confused - your title states "nodejs and yarn" but your post states "npm and yarn". Are you referring solely to the package managers, or are you claiming Node.js as a whole is 4x slower?

In either case, there's nothing stopping these applications from profiling and fixing bottlenecks on Windows. It is very clear that, at the foundational level, properly written software can perform just as well on Windows as on other platforms. Having dealt with both the npm and yarn source code, both are in very large part bloated and unoptimized (npm especially).


Here's just some stats from both projects.

$ cd /src/npm/cli && cloc bin lib
     204 text files.
     204 unique files.
       0 files ignored.

github.com/AlDanial/cloc v 1.82  T=0.55 s (372.5 files/s, 34479.9 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
JavaScript                     196           1984            886          15794
Bourne Shell                     4             14              8            117
DOS Batch                        3              8              0             36
Markdown                         1              9              0             28
-------------------------------------------------------------------------------
SUM:                           204           2015            894          15975
-------------------------------------------------------------------------------

$ cd /src/yarnpkg/yarn && cloc src bin
     167 text files.
     167 unique files.
       2 files ignored.

github.com/AlDanial/cloc v 1.82  T=0.50 s (330.3 files/s, 53159.3 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
JavaScript                     161           4270           2063          20177
Bourne Shell                     1              4              4             27
PowerShell                       1              1              0              5
DOS Batch                        2              0              0              4
-------------------------------------------------------------------------------
SUM:                           165           4275           2067          20213
-------------------------------------------------------------------------------

15-20 thousand lines of code without node modules being installed.

Further, in npm's case, the size with node modules is incredibly huge (yarn is about 1/6th the size in comparison):

$ cd /tmp/npm && npm i npm && du -sh .
31M     .

What's more, you should also look at the platform abstraction library that Node.js runs on top of - libuv. It has to emulate many parts of Unix on Windows for API compatibility, which adds yet another layer between these tools and the OS.


While source code size isn't really a great indicator of what the program does, it's very clear there's a lot of surface area for things to be misused here.

In my opinion, it's not Microsoft's job to profile third-party applications and provide speed improvements - especially when those applications are written with several layers of abstraction between them and the operating system in question. Since this isn't showing proof that Node.js as a platform is slower, I would chalk this up to poor software design on the package managers' parts.

You should, however, definitely bring these findings up with the npm, yarn, and Node.js teams, as they should be the ones spearheading a profiling initiative if one is really needed.

Qix- commented on May 18, 2024

Right, but just because something runs slowly on Windows doesn't necessarily mean it's Microsoft's problem to solve. There are thousands of variables that could make it run slower. Unless you're arguing that Node.js as a platform, as a rule, runs slower on Windows - and not just the individual applications built on top of Node (npm and yarn) - I don't see how this is a platform problem.

Npm is a fully-fledged company with funding and investors - if their commercial product is running slowly, they are fully capable of fixing it.

bitcrazed commented on May 18, 2024

Great issue, thanks for filing @asklar, and as @zooba says, this is indeed a great example of the kind of issue we're looking for!

It's quite possible that there's something in the implementation or port of node/npm/yarn that is inherently slow on Windows.

As I've discussed at length in other issues (e.g. #15), POSIX-first apps do tend to suffer on Windows, for a number of reasons. We're very keen to learn all the reasons why, and to figure out what can be done to reduce/eliminate the root-causes of the perf issues for such apps.

We believe we have a pretty good understanding of the key root-causes, but want to be sure we're not missing anything unexpected.

So, we're measuring problem scenarios (on normalized hardware to minimize hardware-specific differences) and are working to figure out the root-causes of perf issues such as this. We then plan to work out where best to fix the issue - be it at the node/npm/yarn/libuv end, or in Defender/Windows/etc. ... or both!

If the issue lies entirely or partly in an open-source project, we're happy to do our part as members of the open-source community to work with project owners and other community participants to implement a solid fix for the issues we find.

warpdesign commented on May 18, 2024

From my experience this appears to be related to disk performance and is not specific to npm: using npm to install new packages downloads and uncompresses lots of small files, and Windows is not fast (to say the least) at creating lots of small files.

Disabling Windows Defender's real-time check mitigates the problem a little, but it remains very slow compared to other platforms (be it Mac or Linux).
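
To make that concrete, here's a minimal micro-benchmark sketch (illustrative only - the file count and 512-byte payload are arbitrary choices, not anything measured in this thread). Build and run the same program on Windows and on Linux/macOS with comparable disks and compare the elapsed time:

```c
// Illustrative sketch: create many small files and report wall-clock time.
// Creating lots of small files is the pattern npm/yarn installs stress.
#include <stdio.h>
#include <time.h>

int main(void)
{
    const int n = 20000;            /* number of small files to create */
    char name[64];
    char buf[512] = { 0 };          /* 512 bytes of payload per file */
    time_t start = time(NULL);      /* coarse (1 s) but portable timing */

    for (int i = 0; i < n; i++) {
        snprintf(name, sizeof(name), "f%05d.txt", i);
        FILE *f = fopen(name, "wb");
        if (!f) { perror("fopen"); return 1; }
        fwrite(buf, 1, sizeof(buf), f);
        fclose(f);
    }

    printf("created %d files in %.0f s\n", n, difftime(time(NULL), start));
    return 0;
}
```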

Stanzilla commented on May 18, 2024

npm is owned by GitHub, which is owned by Microsoft, so there is a case where it is in Microsoft's interest to make npm faster on Windows, just to address the concerns at the start of the conversation. That fix, however, may not lie in Windows itself, which makes this issue somewhat questionable in terms of scope.

nmoinvaz commented on May 18, 2024

Why does it take so long to delete node_modules on Windows? It seems like the OS could just delete the directory's file system record, instead of scanning some 90k files to prepare for the delete.

asklar commented on May 18, 2024

It's unusual for folks to argue against us fixing something 😂, usually it's the other way around

bitcrazed commented on May 18, 2024

@nmoinvaz - are you trying to delete node_modules using File Explorer or from the command-line?

If the former, it's because File Explorer does a ton of work under the hood to enumerate & track all the files to be deleted, calculate and continually update estimated time to complete, etc.

Deleting these files from the command-line should be MUCH quicker:

  • PowerShell: rm node_modules -force -recurse
  • Cmd: rd /s /q node_modules

zooba commented on May 18, 2024

> Having said that, file explorer and features of the shell are outside the scope of this repo's issues :)

Perhaps, but we've got file system folk handy (on internal threads) if there are ways we can change both sides at once. And I can't imagine any user being upset by delete (especially recycle) being quicker. We're not talking about milliseconds' worth of improvements here, but minutes' worth of "progress" which prevents the top-level folder name from being reused until it's done (which is really what we want to fix - the actual files can silently delete in the background over the next hour for all we care, provided we can put a different node_modules directory in its place).
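
A rough userland approximation of that idea today - purely a sketch, assuming a Win32 console app that shells out to rd for the slow part; this is not a supported OS mechanism - is to rename the tree aside and delete it out of band:

```c
// Sketch: free up the node_modules name immediately, delete in the background.
#include <windows.h>
#include <stdio.h>

int wmain(void)
{
    // A rename within the same volume is a single metadata operation,
    // so the node_modules name becomes reusable right away.
    if (!MoveFileExW(L"node_modules", L"node_modules.trash", 0)) {
        printf("rename failed: %lu\n", GetLastError());
        return 1;
    }

    // Hand the actual recursive delete to a detached cmd.exe so it can
    // grind away while a fresh node_modules is created in its place.
    STARTUPINFOW si = { sizeof(si) };
    PROCESS_INFORMATION pi;
    wchar_t cmd[] = L"cmd.exe /c rd /s /q node_modules.trash";
    if (CreateProcessW(NULL, cmd, NULL, NULL, FALSE,
                       CREATE_NO_WINDOW, NULL, NULL, &si, &pi)) {
        CloseHandle(pi.hThread);
        CloseHandle(pi.hProcess);
    }
    return 0;
}
```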

@bitcrazed is the centre of the web, so he can pull together those responsible for a chat.

zakius commented on May 18, 2024

While it may not be Microsoft's responsibility to fix the performance issue with X, it is MS's problem: I've seen enough people saying "I moved to Y (often Linux, but sometimes also Mac) because X was slow on Windows".

A gentle nudge from a big company can do miracles sometimes, and if that doesn't help there's always the chance to contribute (like MS already does to many open-source projects that may or may not be used in their products). It works pretty much the same way as keeping special-case handling for broken software that called undocumented kernel functions, so as not to hurt customers who rely on that software.

bitcrazed commented on May 18, 2024

Yep - which is why we're particularly keen to identify reproducible perf issues, so that we can root-cause them and either fix the root cause if it lies within Windows itself, and/or work with projects/partners to fix and improve things on their side too.

asklar commented on May 18, 2024

@Bosch-Eli-Black same reasons I described above apply to copy, delete, etc. File Explorer does a lot more than the file system operations; whether that is desirable or not is a question about whether you want the additional features/UX that it enables :)

zooba commented on May 18, 2024

> I can see how those reasons would apply to delete operations, but I'm having a hard time envisioning how they would apply to copy operations

The ability to see progress and cancel the operation is typically what adds the most overhead. It shouldn't be terrible at a per-file level, but when you include the progress callbacks at all, a number of kernel-level optimisations have to be skipped/hampered (see CopyFile2 for some more context around what I mean).

After that, the difference is probably that Explorer resolves via the shell (which includes the file system, as well as a range of potential shell namespaces) while xcopy only supports the file system. That's essential for many applications (e.g. copying from a camera over MTP), but it does add overhead.

Also worth noting that in the general case, progress indicators make people feel like the operation is quicker, even if it's technically slower. So guess how to reduce the number of complaints about slow copies 😉
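
For a concrete picture of the API involved, here's a minimal sketch of calling CopyFile2 with and without a progress routine (the COPYFILE2_* types and constants are the real Win32 definitions; the little program around them is just illustrative):

```c
// Sketch: the same CopyFile2 call with and without a progress callback.
#include <windows.h>
#include <stdio.h>

// Explorer-style UIs hook a routine like this to drive progress bars and
// cancellation; every chunk re-enters user mode, which constrains the copy.
static COPYFILE2_MESSAGE_ACTION CALLBACK Progress(
    const COPYFILE2_MESSAGE *msg, PVOID ctx)
{
    if (msg->Type == COPYFILE2_CALLBACK_CHUNK_FINISHED)
        printf("%llu bytes copied\n",
               msg->Info.ChunkFinished.uliTotalBytesTransferred.QuadPart);
    return COPYFILE2_PROGRESS_CONTINUE;
}

int wmain(void)
{
    // 1) No callbacks: the system is free to copy as efficiently as it can.
    CopyFile2(L"src.bin", L"dst_plain.bin", NULL);

    // 2) With a progress routine: same API, but progress reporting now
    //    shapes how the copy is performed.
    COPYFILE2_EXTENDED_PARAMETERS params = { 0 };
    params.dwSize = sizeof(params);
    params.pProgressRoutine = Progress;
    CopyFile2(L"src.bin", L"dst_progress.bin", &params);
    return 0;
}
```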

> For the uninitiated, such as myself: Dev Drive documentation 🙂

Wish I could've mentioned this to you on the other thread, but we had to wait for the announcement! Dev Drive has a few additional benefits over just being a separate volume (and ReFS has some of its own perf wins over NTFS), so would recommend using it when you can.

asklar commented on May 18, 2024

> pre-existing issue to track possible improvements to explorer-issued operations

@zakius can you please file a bug in feedback hub so the file explorer team can look at this feature request?

> it seems like copying wouldn't need to update Quick Access, jump lists, etc.

@Bosch-Eli-Black unfortunately most - if not all - file operations need to touch the MRU/quick access lists. You copied a file, so it indicates that the source file is something you care about, so we "give it an extra point" in the relevance algorithm.

Agree with @zooba re: copy callbacks (a lot of work happens in each callback, actually) as well as the extra features the shell namespace exposes.

asklar commented on May 18, 2024

@nmoinvaz I happen to have worked on File Explorer for a number of years, so I can tell you that often a "quick delete" is wrong in the context of the experience File Explorer wants to give you. For example, File Explorer needs to keep track of your often-accessed locations and recently accessed files so it can show them in Quick Access. Deleting a folder via File Explorer means that now we have to go update the storage for this information, which takes additional time. The shell file-operations engine is highly extensible; apps can register hooks to get notified when something changes, when files get deleted, etc. On top of that, as was mentioned before, we have to do a first traversal to know how much stuff there is, so we can better estimate how long the operation will take. If you don't care about these features you can del or rd the folder, as was suggested.
Having said that, file explorer and features of the shell are outside the scope of this repo's issues :)

andreujuanc commented on May 18, 2024

I stopped working directly on the Windows FS and moved all my code to WSL2 with dev containers. I know not all workflows will fit this, but it is much better than doing npm i or rm -rf natively, for example. Whatever emulation overhead you get is more than balanced by the sheer speed of the Linux filesystem.
The biggest plus is that you still get to use Windows and all the goodies that come with it (Steam, Office, ransomware).

This is even better for folks who work with Python, because each container can have its dependencies pre-installed - no need to switch environments.

The downside is that you'll need a bit more RAM, but the WSL2 VM is VERY VERY lightweight.

https://github.com/microsoft/vscode-dev-containers

nphmuller commented on May 18, 2024

Really looking forward to see if the new Dev Drive feature improves performance for this scenario! 👀

Eli-Black-Work commented on May 18, 2024

For the uninitiated, such as myself: Dev Drive documentation 🙂

zakius commented on May 18, 2024

> @Bosch-Eli-Black same reasons I described above apply to copy, delete, etc. File Explorer does a lot more than the file system operations; whether that is desirable or not is a question about whether you want the additional features/UX that it enables :)

Maybe we could add a toggle to disable these things during long operations? I don't really need estimates or speed or even an item counter; if I have to perform an operation it will take as long as it takes, and knowing that won't change anything for me. If opting out helps make things faster, I'd gladly do so - including suppressing event emission, if possible.

Personally, I've opted out of all the MRU tracking I could, globally, so maybe that doesn't need to be touched either?

And yeah, it's not directly related, but do we have a pre-existing issue to track possible improvements to Explorer-issued operations?

Eli-Black-Work commented on May 18, 2024

> Wish I could've mentioned this to you on the other thread, but we had to wait for the announcement! Dev Drive has a few additional benefits over just being a separate volume (and ReFS has some of its own perf wins over NTFS), so would recommend using it when you can.

@zooba No worries! 🙂

In the meantime, I took your suggestion from the other thread, and well... wow, what a difference it makes! I'll reply to that in the other thread, though, so we don't spam too many people here, haha. (#87 (comment), for those who are interested. Definitely worth checking out 🙂)

Eli-Black-Work commented on May 18, 2024

@gethari I've tried it, and it seems to help quite a bit! 🙂

Specs:

  • Windows 10
  • Laptop, plugged in, throttling (battery) set to "Best performance"
  • C: and D: are on the same SSD
    • C: has operating system and is NTFS
    • D: is ReFS

Results of running yarn install:
C: (NTFS): 22s
D: (ReFS): 9.8s

Having my code on a non-OS partition led to a 55% speedup 🙂

This was done on a project where all of the .zip dependencies were in the yarn cache, so essentially yarn is spending all of its time creating and populating node_modules\.

Edit: Clarified that I think the root cause of this performance increase is the code being on a non-OS partition. I'd previously said that the speedup was due to using ReFS.

nmoinvaz commented on May 18, 2024

Yes, in File Explorer. Part of the problem is that if I accidentally start deleting node_modules from the File Explorer dialog, it takes a while to respond to even my cancel request.

Oftentimes, if a folder I am deleting is too big, File Explorer will ask me something like, "Folder is too big, deleting this will be permanent, do you want to continue?". Along the same lines, couldn't File Explorer ask me something like, "Folder has many files, do you want to quick delete it?"

nmoinvaz commented on May 18, 2024

@bitcrazed should I file a new GH issue or no?

bitcrazed commented on May 18, 2024

@nmoinvaz - I was about to reply with a polite ask to please file new issues in their own issue rather than piggybacking on existing unrelated issues 😜

Please do file another issue and I'll summarize the points above in reply to your issue description.

bitcrazed commented on May 18, 2024

@asklar THIS! ;)

last-Programmer commented on May 18, 2024

I am also facing this issue. Our Gulp task cold build takes 90 seconds on Mac and Ubuntu, but on Windows it takes 120 seconds.

edmunds22 commented on May 18, 2024

Yeah, in my experience Windows is rubbish for Node development. Either it's slow, or you hit node-gyp issues, which are just painful.

last-Programmer commented on May 18, 2024

On an Apple M1 Max it is even better: it only takes 68 seconds. This project is React with 16,000 ts/tsx files.

mcamprecios commented on May 18, 2024

Is there a list of steps to try to improve performance?

I got a gig that uses Gulp on Node 14.20... It takes about two minutes to compile 90 Pug templates with Sass. I'm basically not able to work, nor to understand how this is possible.

I use Win10 on an AMD Ryzen 7 5800H (3.20 GHz) with 32GB of RAM. So I just want to cry... Tell me I'm not losing a job because I use Windows... :/

asklar commented on May 18, 2024

@kram08980 are you able to try building inside WSL2?

Eli-Black-Work commented on May 18, 2024

@asklar, I've seen some discussion in this issue about deleting node_modules, but I haven't seen much discussion about copying a folder with tons of small files. Any idea why copying via xcopy would be 50% faster than copying via File Explorer?

My setup:

  1. Copying from a non-OS partition to the same non-OS partition (due to @zooba's comment at #87 (comment))
  2. Windows 10, build 19045
  3. .git is ~8.5GB with ~18,000 files.
  4. Copying with File Explorer takes ~45 seconds.
  5. Copying with xcopy .git ..\.git /s /e /h /v /i /k /r consistently takes 30 seconds

Eli-Black-Work commented on May 18, 2024

Actually, maybe that whole question should have been for @zooba, sorry 😆

Eli-Black-Work commented on May 18, 2024

Thanks, @asklar 🙂 I can see how those reasons would apply to delete operations, but I'm having a hard time envisioning how they would apply to copy operations, since it seems like copying wouldn't need to update Quick Access, jump lists, etc.

dmichon-msft commented on May 18, 2024

At its core, the performance of the Node.js package managers (npm, pnpm, yarn) comes down to the performance difference in extracting a TAR to disk on Windows vs. Linux. You can experience the performance gap even with C:\Windows\System32\tar.exe vs. /usr/bin/tar.

The fundamental operation in this scenario is creating lots of directories and files and populating said files with data that is already present in contiguous blocks of RAM. On Linux, the only syscalls used are mkdir, open, write and close, with cost dominated by open.
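
As a sketch (not the actual tar source - the names here, like extract_entry, are illustrative), the per-entry hot loop on Linux is roughly:

```c
// Sketch of the per-entry work a tar extractor does on Linux.
// Each small file costs a mkdir/open/write/close round trip, and in
// practice open() dominates; the data itself is already sitting in RAM.
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

static void extract_entry(const char *dir, const char *path,
                          const void *data, size_t len)
{
    mkdir(dir, 0755);                                         /* ensure parent exists */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);  /* dominant cost */
    if (fd >= 0) {
        write(fd, data, len);
        close(fd);
    }
}
```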

gethari commented on May 18, 2024

> Really looking forward to see if the new Dev Drive feature improves performance for this scenario! 👀

By any chance, did someone try this? And how about people who have laptops with only one small SSD, say 256 GB?

warpdesign commented on May 18, 2024

Nice!
Would be interesting to see the difference with WSL2 / native Linux / other OSes.

Eli-Black-Work commented on May 18, 2024

Something I should mention:

I haven't seen any indications that these speedups are from using ReFS. As far as I can tell, the speedups come from having my code on a non-OS partition. See #87 (comment) and #87 (comment) for more details.

zakius commented on May 18, 2024

I wonder what the reason behind this would be, and whether it can somehow be reduced.
VHDX seems like the most reasonable option, as it doesn't force you to do any guesswork when splitting partitions, but that's still not ideal.
