Comments (10)
Having done some profiling of that test, this is the top bunch of rows from the perf
output, and it's almost all(!) codegen:
# Overhead Command Shared Object Symbol
# ........ ............... ................................................ ...................................................................................................................................................................................................................................................................................................................................................................................................................................................
#
14.63% chaste_codegen python3.10 [.] _PyEval_EvalFrameDefault
2.18% chaste_codegen python3.10 [.] _PyObject_GenericGetAttrWithDict
2.10% chaste_codegen python3.10 [.] PyDict_SetDefault
1.27% chaste_codegen python3.10 [.] 0x000000000012d010
1.08% chaste_codegen python3.10 [.] 0x000000000012e004
1.03% chaste_codegen python3.10 [.] PyCode_NewWithPosOnlyArgs
0.77% chaste_codegen python3.10 [.] PyUnicode_FromFormatV
0.70% chaste_codegen libc.so.6 [.] __memmove_avx_unaligned_erms
0.68% chaste_codegen python3.10 [.] _PyObject_GetMethod
0.67% chaste_codegen python3.10 [.] 0x000000000012d0bc
0.65% chaste_codegen python3.10 [.] _PyDict_DelItem_KnownHash
0.58% chaste_codegen python3.10 [.] 0x000000000012cb18
0.57% chaste_codegen python3.10 [.] _PyFunction_Vectorcall
0.55% chaste_codegen python3.10 [.] 0x000000000012bc85
0.55% chaste_codegen python3.10 [.] _PyDict_GetItemIdWithError
0.51% chaste_codegen python3.10 [.] _PyObject_MakeTpCall
0.51% chaste_codegen python3.10 [.] _PyObject_GenericSetAttrWithDict
0.50% chaste_codegen python3.10 [.] 0x0000000000121846
0.47% chaste_codegen [unknown] [k] 0xffffffff9e1266ec
0.44% chaste_codegen python3.10 [.] PyType_GenericAlloc
0.43% chaste_codegen libc.so.6 [.] __memset_avx2_unaligned_erms
0.43% chaste_codegen python3.10 [.] PyType_Ready
0.42% chaste_codegen python3.10 [.] PyObject_GetAttr
0.41% chaste_codegen python3.10 [.] PyFunction_NewWithQualName
0.41% chaste_codegen [unknown] [k] 0xffffffff9d98a747
0.40% chaste_codegen libc.so.6 [.] _int_malloc
0.39% chaste_codegen [unknown] [k] 0xffffffff9e12677b
0.39% chaste_codegen [unknown] [k] 0xffffffff9d98ae6a
0.39% chaste_codegen python3.10 [.] _PyObject_GC_NewVar
0.38% chaste_codegen python3.10 [.] 0x000000000012d6d8
0.36% chaste_codegen python3.10 [.] PyObject_Realloc
0.35% chaste_codegen python3.10 [.] PyTuple_New
0.33% chaste_codegen python3.10 [.] PyObject_RichCompare
0.29% chaste_codegen python3.10 [.] 0x0000000000176e16
0.28% chaste_codegen python3.10 [.] PyUnicode_Format
0.26% chaste_codegen python3.10 [.] 0x000000000012d014
0.25% chaste_codegen libc.so.6 [.] __memcmp_avx2_movbe
0.24% chaste_codegen python3.10 [.] PyUnicode_InternInPlace
0.24% chaste_codegen libc.so.6 [.] _int_free
0.23% chaste_codegen python3.10 [.] 0x000000000012d66f
0.23% chaste_codegen python3.10 [.] PyDict_GetItemWithError
0.23% chaste_codegen python3.10 [.] _PyUnicodeWriter_Finish
0.22% chaste_codegen python3.10 [.] 0x000000000012d6d0
0.22% chaste_codegen [unknown] [k] 0xffffffff9e2011b7
0.21% chaste_codegen python3.10 [.] 0x000000000012e008
0.21% chaste_codegen python3.10 [.] 0x000000000015763b
0.21% chaste_codegen python3.10 [.] 0x0000000000137c67
0.20% chaste_codegen python3.10 [.] PyList_Append
0.20% chaste_codegen python3.10 [.] PyObject_GC_Del
0.20% chaste_codegen python3.10 [.] 0x0000000000122a8e
0.20% chaste_codegen python3.10 [.] _PyObject_GC_New
0.19% chaste_codegen libc.so.6 [.] __strlen_avx2
0.19% chaste_codegen python3.10 [.] PyObject_Free
0.19% chaste_codegen python3.10 [.] PyBytes_FromStringAndSize
0.19% chaste_codegen python3.10 [.] _PyObject_LookupSpecial
0.19% chaste_codegen python3.10 [.] _PyUnicodeWriter_PrepareInternal
0.19% chaste_codegen libc.so.6 [.] malloc
0.18% chaste_codegen python3.10 [.] 0x000000000012c764
0.18% chaste_codegen python3.10 [.] PyObject_SetAttr
0.17% chaste_codegen python3.10 [.] 0x0000000000157620
0.17% chaste_codegen python3.10 [.] PyMem_Realloc
0.16% chaste_codegen python3.10 [.] PyImport_ImportModuleLevelObject
0.16% chaste_codegen python3.10 [.] _PyLong_New
0.16% chaste_codegen python3.10 [.] _PyUnicode_JoinArray
0.16% chaste_codegen python3.10 [.] 0x00000000001e5c64
0.15% chaste_codegen python3.10 [.] 0x000000000012c018
0.15% chaste_codegen python3.10 [.] 0x000000000012d252
0.15% chaste_codegen python3.10 [.] 0x000000000012bfd8
0.14% chaste_codegen python3.10 [.] 0x000000000012c058
0.14% chaste_codegen python3.10 [.] 0x0000000000121830
0.14% chaste_codegen python3.10 [.] 0x0000000000176f2c
0.13% chaste_codegen python3.10 [.] 0x0000000000176ff2
0.13% chaste_codegen python3.10 [.] 0x0000000000157c28
0.13% chaste_codegen python3.10 [.] PyObject_IsTrue
0.13% chaste_codegen python3.10 [.] 0x000000000012cce0
0.13% chaste_codegen python3.10 [.] 0x000000000012d730
0.13% chaste_codegen python3.10 [.] 0x000000000015323f
0.12% chaste_codegen python3.10 [.] PyObject_Hash
0.12% chaste_codegen [unknown] [k] 0xffffffff9e20001d
0.12% chaste_codegen python3.10 [.] 0x0000000000157c20
0.12% chaste_codegen python3.10 [.] 0x000000000012bc93
0.11% chaste_codegen python3.10 [.] 0x000000000015fbf9
0.11% chaste_codegen python3.10 [.] 0x0000000000137ca4
0.11% TestOnLatticeSi ld-linux-x86-64.so.2 [.] _dl_lookup_symbol_x
0.11% chaste_codegen python3.10 [.] 0x0000000000070994
0.11% chaste_codegen python3.10 [.] 0x000000000012bf50
0.11% chaste_codegen python3.10 [.] _PyErr_SetObject
0.11% chaste_codegen libc.so.6 [.] unlink_chunk.constprop.0
from chaste.
Initial investigation seems to indicate a significant regression between these two runs:
https://chaste.cs.ox.ac.uk/buildbot/builders/Continuous%20GCC/builds/609
and
https://chaste.cs.ox.ac.uk/buildbot/builders/Continuous%20GCC/builds/610
It looks as though this may be related to the following ticket:
https://github.com/Chaste/trac_archive/blob/master/issues/3070.md
which may be related to this merge:
8f7f7ff
from chaste.
As far as the test suite is concerned, the only test that's taking an absurd amount of time is:
TestCaSingleCellRandomMovement
specifically, it's this for loop (where num_runs == 2000
):
for (unsigned i=1; i<=num_runs; i++)
{
simulator.SetEndTime(delta_t*i);
// Run simulation
simulator.Solve();
TS_ASSERT_EQUALS(simulator.rGetCellPopulation().GetNumRealCells(), 1u);
AbstractCellPopulation<2>::Iterator cell_iter = simulator.rGetCellPopulation().Begin();
unsigned cell_location = simulator.rGetCellPopulation().GetLocationIndexUsingCell(*cell_iter);
TS_ASSERT_LESS_THAN(cell_location, 9u);
location_of_cell[cell_location]++;
// Reset the position of the cell
simulator.rGetCellPopulation().MoveCellInLocationMap(*cell_iter, cell_location, 4u);
TS_ASSERT_EQUALS(simulator.rGetCellPopulation().GetLocationIndexUsingCell(*cell_iter), 4u);
}
I would suggest this is the place to start profiling the code to see exactly what is going slowly. I probably can't do it this afternoon, but I'll add it to the list.
from chaste.
I suspect it is because that new code was opening and closing files repeatedly for each writer, which is called once at the end of a Solve (I think?), but this test somewhat unusually loops the solves 2000 times, meaning maybe tens of thousands of master process opening and closing files?
from chaste.
The long duration certainly seems to be down to resetting the simulation such a large number of times, as can be verified by replacing the test with a single run using a time step of 1 and an end time of 2000.
From the merge mentioned above by @fcooper8472 it isn't obvious to me that the addition of population event writer machinery is the culprit, since there are no population event writers present in TestCaSingleCellRandomMovement
and these writers aren't present by default, unless I've missed something. It would be good to get to the bottom of why it's quite so slow now compared to before that merge.
In any case, if we want to keep the numerical checks in TestCaSingleCellRandomMovement
but only have to run a single simulation with time step 1 and end time 2000, then one way to do this would be to interrogate the output file results.vizlocationindices
after calling Solve()
and tally up the proportion of time steps that the cell had location index 0, 1, 2, etc.
As an aside, it's not really obvious to me what part(s) of the classes AbstractCellBasedSimulation
, OnLatticeSimulation
, or CaBasedCellPopulation
are being tested by TestCaSingleCellRandomMovement
that aren't already covered in other tests in TestOnLatticeSimulationWithCaBasedCellPopulation
- it seems more like a test for DiffusionCaUpdateRule
, but we should really be testing that class fully in the test suite TestCaUpdateRules
.
from chaste.
Doing some investigating it looks like the last fast build was on 19th August after https://chaste.cs.ox.ac.uk/trac/changeset/026854fd5340bcccb926d17663609f74d070cbd8/git_repo (which is the changes Fergus pointed out)
Then the first slow one was after https://chaste.cs.ox.ac.uk/trac/changeset/9b0fa3338afb5f2a00254706230d86927a293f6d/git_repo
Which changed the information for writing to build.info
including a while loop to read a string.
I think this is a bit more costly than what was done before as even though its a cell based test its still outputting build.info so still calling GetChasteCodegenVersion(). This test's calling setup 2000 times so could magnify a small increase to a massive one?
I tested this by commenting out GetChasteCodegenVersion() and got a speedup of a factor of 10, but this is in docker using cmake which I don't think I'm doing with an optimised build (been out the game for a year and finally leaning cmake, baby steps)
If people are happy with a small increase in setup then there's no need to fix anything just rewrite this test to be sequential as Alex suggested.
I personally don't think we need to fix the GetChasteCodegenVersion().
from chaste.
I will admit to seeing a chaste_codegen --version
flying about more than I expected on htop
(more than I expected because it should only really need calling once per build I would have thought, not even at all at runtime*, and certainly not at runtime of the cell based code! Opened a ticket for this #122 )
*possibly complicated by the dynamic conversion of CellML to C++?
But we probably don't need to be writing build.info
again for a second call of Solve()
do we? Or is there no way to tell whether your current state has just been retrieved from an archive and you might be on a different machine?
from chaste.
I've illustrated my proposed solution to #122 in draft PR #123, and after that the same profiling looks like:
# Overhead Command Shared Object Symbol
# ........ ............... ................................................ ...........................................................................................................................................................................................................................................................................................................
#
4.64% TestOnLatticeSi ld-linux-x86-64.so.2 [.] _dl_lookup_symbol_x
4.49% TestOnLatticeSi ld-linux-x86-64.so.2 [.] do_lookup_x
1.67% TestOnLatticeSi libstdc++.so.6.0.30 [.] std::_Rb_tree_increment
1.48% TestOnLatticeSi [unknown] [k] 0xffffffff9e1266ec
1.40% TestOnLatticeSi ld-linux-x86-64.so.2 [.] _dl_relocate_object
0.87% TestOnLatticeSi libc.so.6 [.] __strcmp_avx2
0.80% grep [unknown] [k] 0xffffffff9d26125e
0.76% grep [unknown] [k] 0xffffffff9e1266ec
0.73% TestOnLatticeSi ld-linux-x86-64.so.2 [.] strcmp
0.72% TestOnLatticeSi libc.so.6 [.] __cxa_finalize
0.69% sh [unknown] [k] 0xffffffff9e1266ec
0.66% TestOnLatticeSi ld-linux-x86-64.so.2 [.] check_match
0.62% TestOnLatticeSi libboost_serialization.so.1.74.0 [.] boost::serialization::typeid_system::extended_type_info_typeid_0::is_equal
0.60% TestOnLatticeSi libboost_serialization.so.1.74.0 [.] boost::serialization::extended_type_info::operator==
0.58% grep [unknown] [k] 0xffffffff9e12677b
0.53% TestOnLatticeSi [unknown] [k] 0xffffffff9e20001d
0.46% TestOnLatticeSi libc.so.6 [.] _mcount
0.41% TestOnLatticeSi libboost_serialization.so.1.74.0 [.] boost::serialization::void_cast_detail::void_caster::recursive_unregister
0.40% TestOnLatticeSi libboost_serialization.so.1.74.0 [.] boost::serialization::void_cast_detail::void_caster::recursive_register
0.38% TestOnLatticeSi ld-linux-x86-64.so.2 [.] _dl_close_worker.part.0.isra.0
0.36% grep [unknown] [k] 0xffffffff9d98a747
0.36% TestOnLatticeSi libc.so.6 [.] _int_malloc
0.34% sh [unknown] [k] 0xffffffff9e12677b
0.33% TestOnLatticeSi libc.so.6 [.] malloc
0.32% TestOnLatticeSi libc.so.6 [.] _int_free
0.32% TestOnLatticeSi [unknown] [k] 0xffffffff9e20018a
0.30% TestOnLatticeSi [unknown] [k] 0xffffffff9d98a747
0.30% TestOnLatticeSi libc.so.6 [.] __mcount_internal
This "fixes" the problem, but I think it's still worth discussing whether the test itself is appropriate.
from chaste.
We merged in #123 so this should now be a lot faster, as discussed this morning, a new profiling test should be written to cover this case of lots of cell-based Solve()
calls to check for unexpected extra time in setup/finalise of this kind of simulation.
from chaste.
Immediate issue resolved, but we should keep track to make sure this doesn't happen by accident again (new ticket for this #272)
from chaste.
Related Issues (20)
- Ubuntu 24.04 support HOT 5
- Consider decommissioning old machines (Oxford CS machine rooms) HOT 10
- GitHub Actions Node 16 and Artifacts deprecations
- Ubuntu 22.04 PETSC 3.19.6 compiler warnings HOT 2
- Ubuntu 24.04 PETSC citation update HOT 3
- Enhance GPGPU functionality using FLAME GPU HOT 1
- CVODE 7.0.0 support HOT 1
- Create user tutorial for cell-based simulation command line arguments and parameter sweeping HOT 10
- Mailing list incompatible with Google email HOT 3
- Change default location of Chaste Test Output HOT 3
- Keep track of the time it takes to run Continuous test pack
- PyChaste Merge HOT 1
- Rename `python` folder to `infrastructure` and tidy up
- OpenVT HOT 6
- Out of date Doxygen (Code API) main page. HOT 1
- Broken interaction between std::vector and ublas::c_vector HOT 1
- Compilation error in cell based PDE code HOT 4
- Investigate (and possibly rework) how PDEs are defined on cell based simulations HOT 7
- Test executable not being made.
- Arrange September hackathon HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from chaste.