Multiple database shards (maintenance_tasks issue, 3 comments, closed)

sj26 commented on June 26, 2024
Multiple database shards

from maintenance_tasks.

Comments (3)

etiennebarrie commented on June 26, 2024

With a custom job class, you can add your own around_perform, right? I guess even without a custom class? e.g.

# config/initializers/maintenance_tasks.rb
Rails.configuration.to_prepare do
  MaintenanceTasks::Task.attribute(:shard, :string)
  MaintenanceTasks::Task.validates(:shard, presence: true) # or better validation
  MaintenanceTasks::TaskJob.around_perform do |job, block|
    # The job's first argument is assumed to be the run; its task carries the shard attribute.
    shard = job.arguments.first.task.shard.to_sym
    ApplicationRecord.connected_to(shard:, &block)
  end
end
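
One possible reading of the "or better validation" comment is to check the shard name against a known list before it ever reaches connected_to. This is only a plain-Ruby sketch: KNOWN_SHARDS and resolve_shard are hypothetical names, not part of maintenance_tasks or Rails, and the shard list is made up for illustration.

```ruby
# Hypothetical list of configured shards; a real app would derive this
# from its own database configuration.
KNOWN_SHARDS = %i[shard_001 shard_002].freeze

# Maps a user-supplied shard name onto one of the known shard symbols.
# Raises ArgumentError for blank or unknown names, so an unexpected
# value never gets as far as a connection switch.
def resolve_shard(name)
  wanted = name.to_s.strip
  raise ArgumentError, "shard can't be blank" if wanted.empty?

  # Compare as strings so arbitrary input is never converted to a symbol.
  shard = KNOWN_SHARDS.find { |known| known.to_s == wanted }
  raise ArgumentError, "unknown shard: #{wanted.inspect}" if shard.nil?

  shard
end
```

The same list could presumably back an inclusion validation on the task attribute instead of a bare presence check.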

We haven't looked into multiple databases/sharding because we mostly use custom code for that. I'm not super familiar with sharding in Rails either. But it would be neat for maintenance_tasks to handle shards gracefully. I'm assuming we'd still want maintenance_tasks_runs to be stored in the primary database, and only connect on shards/roles for iterating over records, etc.?

Could we automatically detect existing shards and provide a <select> for users to choose from, for example? Would it be possible to have one run for all shards, or do we need one run per shard?

sj26 commented on June 26, 2024

Ah, this is an excellent point — I looked at the way job-iteration locked down the perform method and assumed this wasn't possible, but of course an around hook might do nicely. I'll have a play.

> I'm assuming we'd still want maintenance_tasks_runs to be stored in the primary database, and only connect on shards/roles for iterating over records, etc.?

Yes, our model is that we have a core database which contains all tenants and their shard allocations, and we then choose shards for vertical slices of our application. Not all maintenance tasks need a selection for each slice. Some tasks require only a single selection, others need to enumerate selections.

We have a small number of shards at the moment, so a shard parameter on the task is enough for now. But I'd love to be able to enumerate the shards in the task itself. That's where nested enumerators came to mind: using the first enumerator for the shard, then sub-enumerators for whatever we're enumerating within the shard.

With the recently landed string-based cursors that should be possible. We might be able to prototype this with a custom task job and build_enumerator / enumerator_builder / collection_builder_strategy.
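
To make the nested idea concrete, here is a rough plain-Ruby sketch of an outer loop over shards, an inner loop over records, and a single "shard:id" string cursor that lets iteration resume after the last processed record. It does not use job-iteration's or maintenance_tasks' real enumerator API; SHARD_RECORDS and records_across_shards are hypothetical names, and the in-memory hash only stands in for per-shard tables.

```ruby
# Hypothetical stand-in for per-shard tables: shard name => record ids.
SHARD_RECORDS = {
  "shard_001" => [1, 2, 3],
  "shard_002" => [1, 2],
}.freeze

# Returns an Enumerator yielding [record_id, cursor] pairs across all
# shards, in shard-name order. The cursor is a "shard:id" string naming
# the record just yielded; passing it back in resumes after that record.
def records_across_shards(records_by_shard, cursor: nil)
  resume_shard, resume_id = cursor&.split(":", 2)
  resume_id = Integer(resume_id) if resume_id

  Enumerator.new do |yielder|
    records_by_shard.keys.sort.each do |shard|
      # Shards that sort before the cursor's shard are already done.
      next if resume_shard && shard < resume_shard

      ids = records_by_shard.fetch(shard).sort
      # Inside the cursor's shard, skip records already processed.
      ids = ids.select { |id| id > resume_id } if shard == resume_shard

      ids.each { |id| yielder.yield(id, "#{shard}:#{id}") }
    end
  end
end
```

In a real task the inner loop would run inside a connected_to block for the current shard, and the outer list would come from wherever shard allocations are stored.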

sj26 commented on June 26, 2024

Thanks for the guidance here. We've basically done what you've suggested, something like:

class MaintenanceTaskJob < MaintenanceTasks::TaskJob
  around_perform :switch_shard

  private

  def switch_shard
    # Default to first shard
    shard = :shard_001

    # Allow tasks to parameterize the shard
    # (most easily using `include Maintenance::ShardSelection`)
    if @task.respond_to?(:shard)
      shard = @task.shard
    end

    ShardedRecord.connected_to(shard:, role: ShardedRecord.current_role) do
      yield
    end
  end
end

I'd love to allow enumerating transparently across shards, but will consider that separately.
