
Comments (11)

pransh15 commented on June 28, 2024

This is great, @njiang747! 😄
I'll get back to you tomorrow - once the team reviews it and shares their feedback! 🚀


pransh15 commented on June 28, 2024

No real word limit or minimum requirement here, @njiang747.


njiang747 commented on June 28, 2024

Thanks! I'm working on a draft for the first post now and hope to have it within the next couple of days.


pransh15 commented on June 28, 2024

Hi @njiang747 👋
Thank you for sharing the blog! I've added it to this Google Doc for the team to collaborate on: https://docs.google.com/document/d/1q6U66U-H-t9CqBryHG9zVHWA3w6byCjb4Kokv2WN5IU/edit?usp=sharing

Looking forward to getting this published. 🚀


pransh15 commented on June 28, 2024

Published ✅

  1. How my team uses Unleash to iterate on our AI code assistant
  2. How my team uses Unleash Edge to handle high traffic loads

Thank you @njiang747 for your contributions 👏


ivarconr commented on June 28, 2024

This is a strong outline. Makes it very easy to follow.


pransh15 commented on June 28, 2024

Hey @njiang747 👋

Apologies for the delay here. The team got back to me with the suggestion that we split this into two posts, with separate context for each blog:

Blog 1: How my team at Codeium uses Unleash

  • Brief introduction, high level overview of Codeium as a product (website)
    • AI-powered code assistant
    • Code autocomplete
  • Why is Unleash needed?
    • A/B experiments to test new features
    • As an example, we recently released our fill-in-the-middle model (blog post) after running experiments and seeing improvements in overall performance
  • Architecture / How is Unleash deployed / integrated?
    • Helm deployment in K8s
    • On user machines
      • Node SDK in extension, Go SDK in language server (binary run on the side)
    • On our servers
      • Go SDK in API server that handles requests
  • [Results]

Blog 2: Why Edge is great for handling high traffic loads

  • Describing the problem
    • High load on Postgres (> 300 transactions / second)
    • Tens of thousands of instances registered
  • What was the cause?
    • Clients all hitting the Unleash server directly
    • Metrics were being written through from each client, causing high load on the database
  • How did Edge solve the problems?
    • Overview of Edge and how it helps here
    • Intermediate / proxy between SDKs and Unleash server
    • Handles 10s - 100s of thousands of concurrent requests
    • Batches metric writes
    • Exposes the same API as the Unleash server, so SDKs don't need to know they're talking to Edge
  • Edge deployment process / update to existing code (very simple)
    • Helm chart
    • Change the DNS record to point from the Unleash server to Edge; this immediately moves all traffic to Edge without even having to roll out an update to users
  • Result (> 100x reduction in transactions / second in Postgres)

Looking forward to hearing what you think.


njiang747 commented on June 28, 2024

@pransh15 No worries, this sounds good to me. Do you have a rough estimate of how long you'd want these posts to be?


njiang747 commented on June 28, 2024

Here is a first draft. Happy to iterate on any feedback!

Overview

As a software engineer on the team building Codeium, a free AI-powered code assistant, I've found Unleash to be a powerful tool that lets us iterate quickly in one of the fastest moving spaces in tech today. At Codeium, we leverage generative large language models to provide a developer toolkit integrated directly into our users’ IDEs, including code autocompletion, natural language code search, powerful chat capabilities, automated refactoring, and more! In this nascent and rapidly evolving space, it has been crucial to have an agile and efficient development process with a data-driven approach to landing new features and changes.

How do we use Unleash?

This is exactly where Unleash comes in. It has allowed us to experiment with new features through A/B tests, ranging from using new code completion models to including different types of data in the input prompts. For example, when we developed a new fill-in-the-middle model that takes into account the code context both before and after the user’s cursor (see our blog post), we used Unleash to gradually roll out this feature to our user base. Combined with our internal metrics that evaluate performance, we were able to initially deploy to a small fraction of our users, verify that the new model performed better, and then scale out the feature until all users were migrated over. This was all possible without deploying any incremental updates, since changing the rollout percentage was as simple as dragging a slider in the Unleash dashboard.
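
To make this concrete, here is a minimal sketch of what such a flag check can look like with the Unleash Go SDK. This is not our actual code: the server URL, API token, app name, and the `fill-in-the-middle-model` flag name are all placeholders.

```go
package main

import (
	"log"
	"net/http"

	"github.com/Unleash/unleash-client-go/v4"
)

func main() {
	// Point the SDK at the Unleash server. All values here are placeholders.
	err := unleash.Initialize(
		unleash.WithUrl("https://unleash.example.com/api/"),
		unleash.WithAppName("codeium-api-server"),
		unleash.WithCustomHeaders(http.Header{"Authorization": {"<client-api-token>"}}),
	)
	if err != nil {
		log.Fatal(err)
	}
	// Block until the first set of feature toggles has been fetched.
	unleash.WaitForReady()

	// The gradual-rollout percentage for this (hypothetical) flag lives in the
	// Unleash dashboard, so scaling the experiment from 1% to 100% of users
	// never requires a code change or a redeploy.
	if unleash.IsEnabled("fill-in-the-middle-model", unleash.WithFallback(false)) {
		log.Println("serving completions from the new fill-in-the-middle model")
	} else {
		log.Println("serving completions from the existing model")
	}
}
```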

How is Unleash deployed?

In terms of deployment, Unleash fit in well with our existing stack. We used the Unleash Helm Chart to spin up the necessary components in our Kubernetes cluster, with the only external requirements being a Postgres database to back the Unleash server and some way to expose the server outside our cluster for our clients to reach. We already had a Postgres instance and an ingress controller, so this just required creating a new database and a new ingress rule.

How is Unleash integrated?

Integrating Unleash into the various components of our system was one of the more intricate parts of the setup due to our system’s design. Because our code assistant runs as an extension in the user’s IDE (e.g. VSCode, IntelliJ, NeoVim), we have code running both in the extension on the user’s machine and in our API server running in a Kubernetes cluster. Adding another layer of complexity, the extensions also spin up a separate binary known as the Codeium language server, which contains the core logic reused across all the different IDE extensions and runs alongside the extension on the client’s machine. We have experiments in all three of these components, so our integration uses the Unleash Node SDK in our VSCode extension and the Unleash Go SDK in both the client-side language server and the cloud API server.
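
As a rough sketch of what the Go side of that integration can look like (the Node SDK setup in the VSCode extension mirrors it), here is an illustrative initialization helper; the URL, token, and app names are placeholders rather than our real configuration.

```go
package featureflags

import (
	"net/http"

	"github.com/Unleash/unleash-client-go/v4"
)

// Init connects a component to the shared Unleash server. Each component
// (VSCode extension via the Node SDK, language server and API server via the
// Go SDK) registers under its own app name so experiments can be targeted
// and monitored per component. All values below are placeholders.
func Init(appName string) error {
	return unleash.Initialize(
		unleash.WithUrl("https://unleash.example.com/api/"),
		unleash.WithAppName(appName), // e.g. "codeium-language-server" or "codeium-api-server"
		unleash.WithCustomHeaders(http.Header{"Authorization": {"<client-api-token>"}}),
	)
}
```

The language server would call something like Init("codeium-language-server") at startup and the API server Init("codeium-api-server"), while the extension performs the equivalent setup with the Node SDK.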

Takeaways

Unleash has undoubtedly helped us iterate faster and make informed decisions when shipping new features. After a straightforward deployment and integration period, we had simple, centralized control of our experiments in a single dashboard. It allowed us to scale experiments up and down across our stack without having to release new software, which was especially useful given the complexities of our system. We plan to share more insights in a follow-up post about the scaling challenges we faced and how we used Unleash Edge to overcome them.


pransh15 commented on June 28, 2024

Thanks for the quick turnaround, @njiang747 👏

I've added it to a Google Doc (link) to help us share comments and collaborate better.


njiang747 commented on June 28, 2024

Here's a draft of the follow-up blog post about using Unleash Edge to address a scaling problem I ran into.

Overview

Hello again! If you haven’t gotten a chance yet, check out my previous post on how my team at Codeium uses Unleash to iterate on our AI code assistant. In this article, I’ll cover a scaling problem we ran into due to the nature of our setup and how we leveraged Unleash Edge to resolve it. As a brief recap of our system architecture, we use Unleash in our cloud API server as well as within the IDE extensions running on our users’ machines.

What was the problem?

One day our system went down. All requests to the API server were timing out, which meant our users weren’t seeing any code completions. After a brief investigation, we identified the source of the problem as an overload on our Postgres instance (which contained databases for both our system’s telemetry and Unleash-generated metrics). Analyzing the per-database telemetry, we found that a buggy change introduced in the most recent release caused our system to write far more frequently than intended; a quick revert brought our service back to a healthy state. Concerningly, though, we saw that the number of transactions per second from Unleash was massive, more than 350 tx/sec (!), which put us dangerously close to overloading our Postgres instance and allowed the buggy change to push us over that limit. This load was unexpected and, perhaps most worryingly, it seemed to be growing over time.

What was the cause?

Given this tenuous situation, I reached out on the Unleash Discord for help understanding what could be going wrong and how I could fix it. After discussing back and forth with Unleash team members, we identified a clear red flag: our Unleash dashboard showed tens of thousands of Unleash instances registered. This was a direct result of the distributed nature of our system and an unexpected usage of the Unleash SDKs. In most scenarios, Unleash code running on users’ machines would use the Client SDKs for platforms like iOS, Android, and React. But because we run Node-based extensions and Go binaries on our users’ machines, environments typically reserved for server-side software, we had opted for the Server SDKs without understanding the implications. Ultimately this meant that the number of SDK instances scaled with our users, all of the SDK instances hit our Unleash server directly, and metrics from each SDK instance were written straight through to our Postgres instance, leading to the huge number of transactions per second.

How did Unleash Edge solve the problem?

Even after understanding the core problem, we were still left in somewhat of a bind. We couldn’t leverage the Client SDKs (whose metrics would have been aggregated by an Unleash proxy before being passed to the main Unleash server) since none of them supported Node or Go. Additionally, while spinning up an intermediate server in our cloud that clients could fetch experiments from would have reduced the number of Server SDK instances to one, this was also infeasible since we couldn’t accept the additional round trip latency from our users’ machines to our cloud and back; we needed to quickly provide AI-powered code completions on each keystroke after all.

Ultimately the Unleash team suggested using Unleash Edge. Unleash Edge sits between the SDK instances and the main Unleash server and provides a cached read-replica. It is highly performant, able to handle tens to hundreds of thousands of requests per second, and, most importantly, it batches metric writes from all the SDK instances connected to it. This limits the number of requests hitting the main Unleash server and, transitively, the number of transactions in the Postgres database. In short, it promised a solution to all of the problems we were seeing.

How was Unleash Edge deployed and used?

Unleash Edge was extremely easy to deploy using the Unleash Edge Helm Chart. The only configuration needed was setting the upstream Unleash server URL, which we pointed at the Kubernetes-internal address of the Unleash server since both run in the same cluster. Best of all, Unleash Edge exposes the same API as the Unleash server. This meant that we could make the transition by simply changing the DNS entry we had for the Unleash server to point at Unleash Edge, without having to roll out an update to our users!
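
One way to picture why no client update was needed: the SDKs only ever reference a hostname, so repointing that hostname's DNS record at Edge is invisible to them. A hypothetical sketch (the hostname and app name are placeholders):

```go
package featureflags

import "github.com/Unleash/unleash-client-go/v4"

// InitClient configures the SDK running on a user's machine. Nothing in this
// function changed during the cutover: once the DNS record behind the
// placeholder hostname was repointed from the Unleash server to Unleash Edge,
// all SDK traffic flowed through Edge, because Edge exposes the same client API.
func InitClient() error {
	return unleash.Initialize(
		unleash.WithUrl("https://unleash.example.com/api/"),
		unleash.WithAppName("codeium-language-server"),
	)
}
```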

Takeaways

After making these changes we saw an immediate effect. The number of transactions per second in the Unleash Postgres database decreased by more than 100x, completely nullifying any concerns we had. The only “downsides” introduced were an additional layer of complexity (which really wasn’t that complex) and additional propagation latency for experiment toggle updates (since Unleash Edge has to periodically fetch updates from the Unleash server, and the Unleash SDK instances then have to fetch the update from Unleash Edge). The latter was a worthwhile trade-off and also within our control, since we could adjust the refresh interval for both the SDK instances and Unleash Edge.
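
For a sense of where those knobs live on the SDK side, here is a hedged sketch using the Go client; the durations shown are made up for illustration and are not our production settings (Edge has its own corresponding intervals for syncing with the upstream Unleash server).

```go
package featureflags

import (
	"net/http"
	"time"

	"github.com/Unleash/unleash-client-go/v4"
)

// InitWithIntervals shows the two SDK-side settings that bound how quickly a
// toggle change propagates: how often the SDK polls for toggle updates and
// how often it flushes usage metrics upstream (to Edge, in our setup).
// All values below are illustrative placeholders.
func InitWithIntervals() error {
	return unleash.Initialize(
		unleash.WithUrl("https://unleash.example.com/api/"),
		unleash.WithAppName("codeium-api-server"),
		unleash.WithCustomHeaders(http.Header{"Authorization": {"<client-api-token>"}}),
		unleash.WithRefreshInterval(15*time.Second), // how often feature toggles are re-fetched
		unleash.WithMetricsInterval(60*time.Second), // how often metrics are sent upstream
	)
}
```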

In closing, I recommend that other developers using Unleash try out Unleash Edge for the scaling advantages it provides and the numerous other benefits covered in the docs.

