Comments (61)

paoliniluis avatar paoliniluis commented on July 1, 2024 6

We’re working on this as we speak

from metabase.

paoliniluis avatar paoliniluis commented on July 1, 2024 4

ok we have some strings to pull from now

paoliniluis avatar paoliniluis commented on July 1, 2024 1

@alperengunes if you don't send us what we need, we won't be able to identify the problem

rob-pomelo avatar rob-pomelo commented on July 1, 2024 1

Also seeing a significant performance regression since going from 0.49 -> 0.50, with CPU usage spiking. Will try to get some logs.

perivamsi avatar perivamsi commented on July 1, 2024 1

@alperengunes could you click on the "download diagnostics info" and send over that file to us after reviewing that there is no sensitive information there?

LukeAbell avatar LukeAbell commented on July 1, 2024 1

Also seeing performance regression and CPU spikes.

LukeAbell avatar LukeAbell commented on July 1, 2024 1

What can I share? It seems like it's mostly affecting large dashboards with a lot of questions. Looks like it's the metadata endpoint mostly.

Screenshot 2024-06-18 at 02 05 30 PM@2x
Screenshot 2024-06-18 at 02 07 23 PM@2x

Even loading the dashboard spikes CPU usage of our pods:
Screenshot 2024-06-18 at 02 08 05 PM@2x

matotias avatar matotias commented on July 1, 2024 1

> @matotias and what engine/sizing are you using as well

We went from 0.49.12 to 0.50.4 and the spikes started, then we updated again to 0.50.5 but we're still seeing them

We use postgres, 4 cpu 16 gigs of ram

matotias avatar matotias commented on July 1, 2024 1

Update on our case: we went back to v0.49.12 and the database load is normal again

psneha716 avatar psneha716 commented on July 1, 2024 1

Hi,
I tried disabling query parsing using MB_SQL_PARSING_ENABLED=false but it's of no help. Metabase is still crashing with OOM.
Version: v0.50.1

Context:

Metabase deployed on ECS with EC2.
CPU per task: 3800 units (3.711 vCPU)
Memory per task: 3700 MiB (3.613 GB)

2 tasks deployed currently

cheyuriy avatar cheyuriy commented on July 1, 2024 1

Similar issue with CPU spikes after upgrading from 0.49.x to 0.50.6.
Using Postgres 14 as a DB. Metabase is deployed on 2CPU-12GB in GCE and DB is in CloudSQL.

The upgrade was at the end of 06/24.

Screenshot 2024-06-25 at 11 12 36

LukeAbell avatar LukeAbell commented on July 1, 2024 1

@uladzimirdev I had to clear my local browser cache to resolve the issue. We have Metabase behind a Google Cloud CDN, but that didn't seem to be the issue.

noahmoss avatar noahmoss commented on July 1, 2024 1

Hi @psneha716 — I don't see any errors in the downgrading logs that indicate specifically why it didn't work. But I did fix an issue related to downgrading in Metabase 50.7, the latest release. Could you try starting that version up, and then running the downgrade command with it?

If that doesn't work could you open a new issue with the full logs?

uladzimirdev avatar uladzimirdev commented on July 1, 2024

@alperengunes could you please add more details about your configuration?

paoliniluis avatar paoliniluis commented on July 1, 2024

@alperengunes please post the logs, specs... at the very least, you're probably not running Metabase correctly there, since you're probably not tweaking the heap limit

alperengunes avatar alperengunes commented on July 1, 2024

I didn't have this problem on the 49.x releases, but I do on 50.x. I haven't made any other changes.

alperengunes avatar alperengunes commented on July 1, 2024

Not only this, but I have also been getting expression errors since I updated to Metabase 50.x

image
image

perivamsi avatar perivamsi commented on July 1, 2024

@LukeAbell any diagnostics you can share with us that can help debug this?

rob-pomelo avatar rob-pomelo commented on July 1, 2024

Also running in GKE and have had a similar CPU spike on upgrade, just like ^

uladzimirdev avatar uladzimirdev commented on July 1, 2024

@rob-pomelo https://www.metabase.com/docs/latest/troubleshooting-guide/diagnostic-info

LukeAbell avatar LukeAbell commented on July 1, 2024

@perivamsi Just emailed support logs for the dashboard.

paoliniluis avatar paoliniluis commented on July 1, 2024

@LukeAbell @rob-pomelo and @alperengunes can you add MB_SQL_PARSING_ENABLED=false, reboot and then let us know if this keeps happening?

@alperengunes
2024-06-18 14:33:00,039 WARN malli.fn :: Invalid output
ERROR middleware.add-source-metadata :: Error determining expected columns for query
clojure.lang.ExceptionInfo: Error preprocessing query in metabase.query_processor.preprocess
Caused by: clojure.lang.ExceptionInfo: Invalid query

please post the questions that spit those lines

matotias avatar matotias commented on July 1, 2024

Not sure if it's the same issue, but we started seeing huge spikes in the database after updating. These updates seem to be the problem:
image

paoliniluis avatar paoliniluis commented on July 1, 2024

@matotias what version did you upgrade from?

paoliniluis avatar paoliniluis commented on July 1, 2024

@matotias and what engine/sizing are you using as well

LukeAbell avatar LukeAbell commented on July 1, 2024

@paoliniluis Didn't seem to change much. I think it's permission or sandbox related because it's much faster as a super admin vs a normal user that's sandboxed using full app embed.

LukeAbell avatar LukeAbell commented on July 1, 2024

Check out the difference in time between sandboxed (right) and super admin (left).
Screenshot 2024-06-18 at 03 26 53 PM@2x

paoliniluis avatar paoliniluis commented on July 1, 2024

@LukeAbell I'm trying to reproduce what you're seeing and I'm wondering how it's taking 4 seconds to get a 30 KB response. Can you describe the queries a little more? How are the queries built (based on questions or models)? Are the tables big? Many joins?

paoliniluis avatar paoliniluis commented on July 1, 2024

> > @matotias and what engine/sizing are you using as well
>
> We went from 0.49.12 to 0.50.4 and the spikes started, then we updated again to 0.50.5 but we're still seeing them
>
> We use postgres, 4 cpu 16 gigs of ram

4 CPUs and 16 GB of RAM is the Postgres sizing, right?

LukeAbell avatar LukeAbell commented on July 1, 2024

@paoliniluis want to hop on a quick video chat?

paoliniluis avatar paoliniluis commented on July 1, 2024

@LukeAbell give me a couple of hours so I can come to the call with some answers. Can you make it tomorrow, first thing in the morning?

LukeAbell avatar LukeAbell commented on July 1, 2024

@paoliniluis Tomorrow at 10am EST work?

LukeAbell avatar LukeAbell commented on July 1, 2024

> @LukeAbell I'm trying to reproduce what you're seeing and I'm wondering how it's taking 4 seconds to get a 30 KB response. Can you describe the queries a little more? How are the queries built (based on questions or models)? Are the tables big? Many joins?

Queries are based on questions (no models involved). Not a ton of joins. We're using BigQuery, and the table has about 800k rows.

Only 6 tables are returned in that query_metadata call.
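
To put numbers on the sandboxed vs. admin difference outside the browser, a rough sketch along these lines could time the same metadata request under two sessions. It assumes the call seen in the network tab is `GET /api/dashboard/<id>/query_metadata` and that a session id can be sent in the `X-Metabase-Session` header; the base URL, dashboard id, and session ids are placeholders, and the helper names are made up for illustration.

```python
import sys
import time
from urllib.request import Request, urlopen


def query_metadata_url(base_url: str, dashboard_id: int) -> str:
    """Build the dashboard metadata URL (the path is an assumption, see above)."""
    return f"{base_url.rstrip('/')}/api/dashboard/{dashboard_id}/query_metadata"


def timed_get(url: str, session_id: str) -> tuple[float, int]:
    """GET `url` as the given session; return (elapsed seconds, body size in bytes)."""
    req = Request(url, headers={"X-Metabase-Session": session_id})
    start = time.perf_counter()
    with urlopen(req, timeout=60) as resp:
        body = resp.read()
    return time.perf_counter() - start, len(body)


# Usage: python time_metadata.py BASE_URL DASHBOARD_ID SESSION_ID [SESSION_ID ...]
if __name__ == "__main__" and len(sys.argv) > 3:
    url = query_metadata_url(sys.argv[1], int(sys.argv[2]))
    for index, session_id in enumerate(sys.argv[3:], start=1):
        seconds, size = timed_get(url, session_id)
        print(f"session {index}: {seconds:.2f}s for {size} bytes")
```

Passing an admin session id and a sandboxed user's session id in turn gives two timings for the same dashboard that can be compared directly.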

paoliniluis avatar paoliniluis commented on July 1, 2024

Thanks @LukeAbell, do you have observability in the app db to see if there's something weird there?

LukeAbell avatar LukeAbell commented on July 1, 2024

Yep, doesn't seem like anything weird there. Less than 20% utilization. Running MySQL.

matotias avatar matotias commented on July 1, 2024

> > > @matotias and what engine/sizing are you using as well
>
> > We went from 0.49.12 to 0.50.4 and the spikes started, then we updated again to 0.50.5 but we're still seeing them
> > We use postgres, 4 cpu 16 gigs of ram
>
> 4 cpu and 16 gigs of ram is the postgres sizing right?

Yes!

alperengunes avatar alperengunes commented on July 1, 2024

> @alperengunes could you click on the "download diagnostics info" and send over that file to us after reviewing that there is no sensitive information there?

checking

alperengunes avatar alperengunes commented on July 1, 2024

> @LukeAbell @rob-pomelo and @alperengunes can you add MB_SQL_PARSING_ENABLED=false, reboot and then let us know if this keeps happening?
>
> @alperengunes
> 2024-06-18 14:33:00,039 WARN malli.fn :: Invalid output
> ERROR middleware.add-source-metadata :: Error determining expected columns for query
> clojure.lang.ExceptionInfo: Error preprocessing query in metabase.query_processor.preprocess
> Caused by: clojure.lang.ExceptionInfo: Invalid query
>
> please post the questions that spit those lines

not working :'( same

paoliniluis avatar paoliniluis commented on July 1, 2024

@matotias is your app db a postgres or mysql?

matotias avatar matotias commented on July 1, 2024

postgres

naveenthontepu avatar naveenthontepu commented on July 1, 2024

Metabase is crashing every 3 to 4 hours in our case also. There is no significant jump in the CPU utilisation but the RAM utilisation has significantly increased and is consuming the complete 4 gb ram we have on the metabase server.

Looking at the tasks running on our server by running "htop" we are seeing that there are too many tasks metabase is keeping active even when there is nobody using metabase.

perivamsi avatar perivamsi commented on July 1, 2024

@naveenthontepu could you send us the output of htop and logs of the tasks that Metabase is keeping active?

paoliniluis avatar paoliniluis commented on July 1, 2024

Also: any specific environment variable that you’re using?

naveenthontepu avatar naveenthontepu commented on July 1, 2024

Screenshot 2024-06-19 at 5 29 24 PM

There are no environment variables that we are using.

The command we use to run Metabase explicitly limits it to 2 GB of the 4 GB of RAM, but the limit is not being respected: it uses the full 4 GB and crashes. The screenshot above also shows Metabase consuming 71% of memory, which is ~3 GB.

noahmoss avatar noahmoss commented on July 1, 2024

@LukeAbell what version of MySQL are you on?

LukeAbell avatar LukeAbell commented on July 1, 2024

@noahmoss 8.0.31

perivamsi avatar perivamsi commented on July 1, 2024

@LukeAbell we have some changes coming that might improve the performance. Would you be able to upgrade to 50.7 and test? Which version are you on right now? Is it easy for you to downgrade if 50.7 is still slow for you?

LukeAbell avatar LukeAbell commented on July 1, 2024

@perivamsi yep just let me know when there's a tagged release. Or do I need to test on a custom branch?

perivamsi avatar perivamsi commented on July 1, 2024

no need to test a custom branch, I'll give you a proper release link once ready

shrey-locad avatar shrey-locad commented on July 1, 2024

Exactly the same problem since the upgrade to 0.50. Memory and CPU spike intermittently, bringing down the whole container every 3-4 hours. We will give it 1-2 more days before downgrading back to 0.49.

perivamsi avatar perivamsi commented on July 1, 2024

@shrey-locad sorry about that! are you open to having a call that can help us debug this better? as I said, we are working on this but it would be helpful to have additional data points around the memory and CPU spikes. also, hi from another IITB alum!

shrey-locad avatar shrey-locad commented on July 1, 2024

@perivamsi - haha, that's a pleasant surprise. Hi 👋🏽

It might be hard to debug over a call, especially since the crashes are unpredictable and intermittent. We have been unable to identify a specific trigger either. The latest crash was in the middle of the night, when we're pretty sure none of our users were online. How about we give you SSH access to the instance instead? A call works too. Either way. [email protected]

perivamsi avatar perivamsi commented on July 1, 2024

That works, we just need to look at the logs. I'll email you.

perivamsi avatar perivamsi commented on July 1, 2024

@LukeAbell @shrey-locad @cheyuriy can you all please upgrade to 50.7 which has some fixes for the performance issues?

https://github.com/metabase/metabase/releases/tag/v0.50.7

LukeAbell avatar LukeAbell commented on July 1, 2024

@perivamsi I upgraded but had to quickly downgrade because it completely broke the application. Downgrading back to v1.50.6 fixed the issue. I tried clearing our CDN cache and flushing browser cache but it didn't help.

Assets wouldn't load giving an error:

Refused to execute script from 'HOST/app/dist/runtime.ba18610064ff9fba6745.js' because its MIME type ('') is not executable, and strict MIME type checking is enabled.
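
The error above says the script came back with an empty MIME type. For anyone wanting to check what their CDN or proxy is actually returning, a minimal sketch like the following could help. It assumes the asset is reachable with a plain `HEAD` request (some servers answer `HEAD` differently from `GET`), and the list of accepted types is a subset of the JavaScript MIME types browsers recognise, not an exhaustive one.

```python
import sys
from urllib.request import Request, urlopen

# A subset of the MIME types browsers treat as JavaScript.
JS_MIME_TYPES = {
    "text/javascript",
    "application/javascript",
    "application/x-javascript",
    "text/ecmascript",
    "application/ecmascript",
}


def is_js_mime(content_type: str) -> bool:
    """Return True if a Content-Type header value names a JavaScript MIME type."""
    essence = content_type.split(";", 1)[0].strip().lower()
    return essence in JS_MIME_TYPES


def check_asset(url: str) -> tuple[str, bool]:
    """Request headers for `url`; return its Content-Type and whether it is JavaScript."""
    req = Request(url, method="HEAD")
    with urlopen(req, timeout=10) as resp:
        content_type = resp.headers.get("Content-Type", "")
    return content_type, is_js_mime(content_type)


# Usage: python check_mime.py URL [URL ...]
if __name__ == "__main__":
    for url in sys.argv[1:]:
        content_type, ok = check_asset(url)
        print(f"{url}: Content-Type={content_type!r} javascript={ok}")
```

Running it against the same asset URL through the CDN and directly against the Metabase host would show whether the missing header comes from a cached response or from the application itself.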

perivamsi avatar perivamsi commented on July 1, 2024

We're on it

bshepherdson avatar bshepherdson commented on July 1, 2024

That sounds like #39449. If force-refresh doesn't work, try another browser?

It does seem like it was a transitory issue in the past.

LukeAbell avatar LukeAbell commented on July 1, 2024

@perivamsi @bshepherdson Completely cleared cache and that worked. Unfortunately not much of an improvement in performance (if any) 😔

  • Query metadata taking 4-6 seconds sandboxed
  • Queries taking 3-5 seconds sandboxed
  • CPU usage still spiking like crazy loading a single dashboard

uladzimirdev avatar uladzimirdev commented on July 1, 2024

@LukeAbell just for the record, what kind of cache have you cleared? what CDN do you use? we're struggling to reproduce the issue, so any info would be helpful (some users had it during upgrades from older versions)

psneha716 avatar psneha716 commented on July 1, 2024

Hi folks,
Could someone who has already downgraded their Metabase version help me out with the downgrading process? I tried downgrading, but it isn't working. Any help would be greatly appreciated. Thanks!

Cluster configuration:

ECS cluster with 2 tasks.
Task CPU: 3800 units (3.711 vCPU)
Task memory: 3700 MiB (3.613 GB)

Steps that I followed for downgrading (followed instructions from here):

  • Stopped the container running v0.50.1 metabase instance

  • Ran the following on the EC2 instance directly:

$ docker run --rm metabase/metabase:v0.50.1 "migrate down"
Warning: environ value jdk-11.0.23+9 for key :java-version has been overwritten with 11.0.23
2024-06-12 21:08:25,427 INFO metabase.util :: Maximum memory available to JVM: 1.9 GB
2024-06-12 21:08:27,762 INFO util.encryption :: Saved credentials encryption is DISABLED for this Metabase instance. 🔓
 For more information, see https://metabase.com/docs/latest/operations-guide/encrypting-database-details-at-rest.html
2024-06-12 21:08:28,566 WARN db.env :: WARNING: Using Metabase with an H2 application database is not recommended for production deployments. For production deployments, we highly recommend using Postgres, MySQL, or MariaDB instead. If you decide to continue to use H2, please be sure to back up the database file regularly. For more information, see https://metabase.com/docs/latest/operations-guide/migrating-from-h2.html
2024-06-12 21:08:32,573 INFO driver.impl :: Registered abstract driver :sql  🚚
2024-06-12 21:08:32,579 INFO driver.impl :: Registered abstract driver :sql-jdbc (parents: [:sql]) 🚚
2024-06-12 21:08:32,585 INFO metabase.util :: Load driver :sql-jdbc took 32.7 ms
2024-06-12 21:08:32,586 INFO driver.impl :: Registered driver :h2 (parents: [:sql-jdbc]) 🚚
2024-06-12 21:08:32,717 INFO driver.impl :: Registered driver :mysql (parents: [:sql-jdbc]) 🚚
2024-06-12 21:08:32,744 INFO driver.impl :: Registered driver :postgres (parents: [:sql-jdbc]) 🚚
2024-06-12 21:08:34,401 INFO metabase.core ::
Metabase v0.50.1 (cc4ca82)

Copyright © 2024 Metabase, Inc.

Metabase Enterprise Edition extensions are NOT PRESENT.
2024-06-12 21:08:34,657 INFO db.setup :: Setting up Liquibase...
2024-06-12 21:08:35,245 INFO db.liquibase :: Updating liquibase table to reflect consolidated changeset filenames
2024-06-12 21:08:35,254 INFO db.liquibase :: No migration lock found.
2024-06-12 21:08:35,255 INFO db.liquibase :: Migration lock acquired.
2024-06-12 21:08:35,259 INFO db.setup :: Liquibase is ready.
2024-06-12 21:08:35,262 INFO db.liquibase :: No migration lock found.
2024-06-12 21:08:35,263 INFO db.liquibase :: Migration lock acquired.
2024-06-12 21:08:35,264 INFO db.liquibase :: Rolling back app database schema to version 49
  • Tried spinning up a container using v0.49.15, but getting the following error:
2024-06-12 21:10:56,380 INFO metabase.core :: Setting up and migrating Metabase DB. Please sit tight, this may take a minute...
--
2024-06-12 21:10:56,382 INFO db.setup :: Verifying postgres Database Connection ...
2024-06-12 21:10:56,724 INFO db.setup :: Successfully verified PostgreSQL 13.13 application database connection. ✅
2024-06-12 21:10:56,725 INFO db.setup :: Checking if a database downgrade is required...
2024-06-12 21:10:56,834 ERROR middleware.log :: GET /api/health 503 287.2 µs (0 DB calls)
{:status "initializing", :progress 0.3}
2024-06-12 21:10:56,844 ERROR middleware.log :: GET /api/health 503 182.0 µs (0 DB calls)
{:status "initializing", :progress 0.3}
2024-06-12 21:10:57,344 ERROR metabase.core :: Metabase Initialization FAILED
clojure.lang.ExceptionInfo: ERROR: Downgrade detected.
Your metabase instance appears to have been downgraded without a corresponding database downgrade.
You must run `java -jar metabase.jar migrate down` from version 50.
Once your database has been downgraded, try running the application again.
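
A possible reading of the two logs above, offered as a guess rather than a confirmed diagnosis: the `migrate down` run warns about an H2 application database, while the v0.49.15 startup verifies a PostgreSQL 13.13 connection, so the rollback may have run against the container's default H2 file instead of the Postgres app db. The sketch below checks a log for that mismatch and builds a `docker run` command that forwards the app db settings. It assumes the instance is configured through the `MB_DB_*` environment variables (a deployment using `MB_DB_CONNECTION_URI` would forward that instead); the helper names are made up for illustration.

```python
import re

# App db settings to forward from the host environment into the container.
APP_DB_ENV = [
    "MB_DB_TYPE",
    "MB_DB_HOST",
    "MB_DB_PORT",
    "MB_DB_DBNAME",
    "MB_DB_USER",
    "MB_DB_PASS",
]


def detect_app_db(log_text: str) -> str:
    """Guess which application database a Metabase process used, from its log."""
    if "H2 application database" in log_text:
        return "h2"
    match = re.search(r"Verifying (\w+) Database Connection", log_text)
    return match.group(1).lower() if match else "unknown"


def migrate_down_command(image: str) -> list[str]:
    """Build a `docker run` argv that passes the app db settings to `migrate down`."""
    cmd = ["docker", "run", "--rm"]
    for name in APP_DB_ENV:
        # `-e NAME` without a value copies NAME from the host environment.
        cmd += ["-e", name]
    return cmd + [image, "migrate down"]


if __name__ == "__main__":
    print(" ".join(migrate_down_command("metabase/metabase:v0.50.1")))
```

If `detect_app_db` reports different databases for the rollback log and the startup log, the rollback most likely never touched the database that the v0.49.15 container connects to.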

psneha716 avatar psneha716 commented on July 1, 2024

Hi @noahmoss ,
It did not work. Created a new issue here.
