gcloudrig's Introduction

A collection of bash scripts to help create and maintain a cloud gaming rig in Google Cloud Platform, on the cheap.

Quickstart

Open in Cloud Shell

Prerequisites

  • A Google Cloud project with an active billing account and GPU Quota.
  • A working bash shell with the gcloud command.

It's also recommended to install the following on your local device (PC, Mac, Android, etc.) that you'll be streaming to:

Specs & Costs

You'll be charged for the following resources while your rig is running:

You'll also be charged for the following while your rig is running and at rest:

Cloud responsibly. These scripts are provided as-is, with minimal support. While they're designed to minimise costs at-rest, things may not always go to plan. It's recommended to use a dedicated GCP project and/or billing account with billing alerts to avoid any nasty surprises.

Setup

  • Create a new GCP project
  • Launch Cloud Shell
    • Linux/WSL users: Launch a bash shell locally and run gcloud init
  • Clone this repository:
    $ git clone "https://github.com/gcloudrig/gcloudrig"
    
  • Run setup.sh and follow the prompts
    $ cd "gcloudrig"
    $ ./setup.sh
    
    Created [gcloudrig].
    Activated [gcloudrig].
    
    You can use https://cloudharmony.com/speedtest-latency-for-google:compute to test for latency and find your closest region.
    
    Select a region to use:
    1) asia-southeast1          5) us-central1
    2) australia-southeast1     6) us-east4
    3) europe-west4             7) us-west2
    4) northamerica-northeast1
    #? 2
    
    Would you like to automatically install some things? [y/n] y
    
    1) InstallBattlenet=false  4) ZeroTierNetwork=
    2) InstallSteam=false      5) Done
    3) VideoMode=1920x1080
    #? 1
    
    1) InstallBattlenet=true  3) VideoMode=1920x1080    5) Done
    2) InstallSteam=false     4) ZeroTierNetwork=
    #? 2
    
    1) InstallBattlenet=true  3) VideoMode=1920x1080    5) Done
    2) InstallSteam=true      4) ZeroTierNetwork=
    #? 4
    
    We strongly recommend you create a new ZeroTier network for Gcloudrig
    https://my.zerotier.com/network
    
    Gcloudrig ZeroTier network id [or quit]: abcdef1234567890
    
    1) InstallBattlenet=true             3) VideoMode=1920x1080               5) Done
    2) InstallSteam=true                 4) ZeroTierNetwork=abcdef1234567890
    #? 5
    
    Enabling Gcloudrig software installer...
    Creating instance template 'gcloudrig-setup-template'...
    Creating managed instance group 'gcloudrig-group'...
    
    Done!  Run './scale-up.sh' to start your instance.
    
    
  • Run ./scale-up.sh to start your instance.
    • Your instance will launch and automatically start installing software, which will take around 10-20 mins to finish. Open the Log Viewer to track its progress.

Connect and finish setup (not automatic, yet)

  • Open your ZeroTier network; scroll down to the Members section and mark the Auth? checkbox next to your gcloudrig. You can verify the correct host by matching the Physical IP against your running compute instances.
  • Use Remote Desktop to connect to your rig with the ZeroTier IP. Your username and password can be found in the logs.
  • Finish the Parsec installation by logging in and enabling hosting.
  • Double-click the Disconnect RDP shortcut on the desktop, which will drop your RDP session back to the local screen. This bypasses the Windows lock screen, which Parsec doesn't have permission to see.
  • Log in to Parsec locally and connect back to your instance.
  • When you reconnect, the Parsec logo should be running in your rig's system tray. Right-click it, and set it to Run when my computer starts.
  • If everything seems stable, double-click Post ZeroTier Setup Security on the desktop to lock down TightVNC and Parsec.

Optional setup

  • Get a free public hostname for your private ZeroTier IP at Duck DNS.
    • If you want a hostname for your dynamic public IP as well, you'll need to install a DDNS client or a startup script.
  • Restrict public access to RDP ports by modifying the default-allow-rdp rule in VPC Firewall.
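
As a rough sketch of the firewall step above, assuming the rule is still named default-allow-rdp and using 203.0.113.7/32 as a placeholder for your own public IP. The DRY_RUN variable and the helper names are illustrative, not part of gcloudrig; the command is only printed unless DRY_RUN=0.

```bash
#!/usr/bin/env bash
# Sketch: restrict the default RDP firewall rule to one source range.
# 203.0.113.7/32 is a documentation placeholder; substitute your own IP.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

restrict_rdp() {
  local cidr="$1"
  run gcloud compute firewall-rules update default-allow-rdp \
      --source-ranges "$cidr"
}

restrict_rdp "203.0.113.7/32"
```

Once ZeroTier is working you may not need public RDP at all, in which case the source range can be narrowed further.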

Starting your rig

Run ./scale-up.sh to start your instance.

After your rig has started, it will create a new games disk or restore an existing one from a snapshot and attach it to itself.

Stopping your rig

Run ./scale-down.sh to shut down your instance.

Once stopped, it will take a few minutes to pack away the boot disk and games disk. Read What happens when I stop my rig? below for more info.

Troubleshooting

If you're having difficulty connecting with a game streaming client, use RDP or TightVNC to access your machine.

  • RDP can't be used to control your rig's local display (which can upset Parsec, especially if it's stuck on the lock screen). There is a desktop hack to "drop" the remote session to the local display, but it's not always reliable.
  • TightVNC can control the local display and interact with the lock screen, but is less reliable and less secure. It's locked down to your ZeroTier network during initial setup, just in case.

If you forget your password, use ./reset-windows-password.sh to get a new one. Note that when you do this, you'll also need to update the password for automatic login (use Start > Run > control userpasswords2).

If you need to set up a custom resolution (e.g. 1800x1200), you might have issues with the native NVIDIA drivers. Custom Resolution Utility (CRU) works well, and while the automatic options in Parsec should also work, you can force it to behave too.

If you need a nuclear option, delete everything and start over with these commands:

$ ./destroy.sh
$ ./setup.sh

Maintenance and FAQ

What happens when I stop my rig?

During the scale-down script, your boot disk (C:) is stored away as a custom image, and your games disk (G:) is stored away as a persistent disk snapshot. These are the only two at-rest costs that should be associated with your rig.

Can I resize my disks?

If you need more space or faster disk performance, you can always increase the size of your disks while your rig is running.

It's recommended to keep usage on your boot disk (C:) as small as possible, since at-rest it's stored as a custom image which has higher pricing than the snapshots used to store the games disk (G:).

To take advantage of the performance boost (https://cloud.google.com/compute/docs/disks/performance) from having a larger disk while limiting your actual disk usage for at-rest costs, simply shrink the volume back down in Windows Disk Management after the resize.
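
The resize itself can be done from the same shell. This is a minimal sketch, assuming the games disk is named gcloudrig-games and lives in australia-southeast1-a; both values are placeholders to adjust, and the DRY_RUN switch is illustrative. It prints the command unless DRY_RUN=0.

```bash
#!/usr/bin/env bash
# Sketch: grow a persistent disk while the rig is running.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

resize_disk() {
  local disk="$1" size_gb="$2" zone="$3"
  run gcloud compute disks resize "$disk" \
      --size "${size_gb}GB" --zone "$zone"
}

# Placeholder disk name, size and zone.
resize_disk gcloudrig-games 200 australia-southeast1-a
```

Persistent disks can only grow, so the shrink described above applies to the Windows volume, not to the disk itself.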

The maximum resolution is 1366x768 or 1280×1024 or my framerate drops to 15fps after 20 minutes

These are all symptoms of NVIDIA GRID / Quadro Licence failures; the best suggestion is to reinstall the GRID® drivers for virtual workstations and restart your rig.

Where do I find the licenced NVIDIA GRID Drivers?

The easiest way to browse and download the drivers is using the Storage Browser in Google Cloud Console: https://console.cloud.google.com/storage/browser/nvidia-drivers-us-public/GRID

Travelling?

gcloudrig keeps your rig as a boot image and disk snapshot in the same GCE region. To move your rig to a different part of the world, just run ./change-region.sh to change your default region, then run ./scale-up.sh. Restoring snapshots in a different region may incur network costs, so be careful!

Manual software installation

If you answered No to automatic installation during ./setup.sh, run ./scale-up.sh and a clean Windows Server instance will be created.

Run ./reset-windows-password.sh to get the IP, Username and Password, then connect to your instance with Remote Desktop. See Creating Passwords for Windows Instances and Connecting to Windows Instances for more info.

We recommend the following software, but feel free to find your own:

  • Install GRID® drivers for virtual workstations
  • Install Virtual Audio Cable to use as a virtual sound card
  • Install ZeroTier
  • Install Parsec and set up Autologon to bypass the lock screen on boot.
    • If automatic login fails, or you access your instance with RDP then the lock screen will prevent most streaming software from working. You can use TightVNC to access the lock screen.
    • Alternatively, use RDP and an unlock script to drop the RDP session directly to the local console, securely bypassing the lock screen.
  • Install game clients (e.g. Steam, Battlenet) and enjoy!

Contributing

Pull requests against the develop branch are welcome!

gcloudrig's People

Contributors

akilleen, deekue, oloflarsson, putty182, timsoethout, vokativ

gcloudrig's Issues

snapshot no longer attached

My Games disk isn't attached to the VM anymore. Is there any way I can attach it manually, or do I have to run setup again?

VM loop - Keeps restarting; after all the setup finished

There seems to be an issue where even after all the setup and software installation is finished, the rig/VM keeps restarting randomly (seems every 30 seconds) and I cannot log in fast enough, as it just shuts down before the RDP connects. It doesn't seem to be connected to anything.

I also noticed that the logs, when finished, don't show the password as they usually do at the end of the software installation and setup.

Avoid orphaned boot disks

To avoid orphaned boot disks from hanging around and consuming space, update instance template config to use --boot-disk-auto-delete and update scale-down scripts to issue gcloud compute instances set-disk-auto-delete --no-auto-delete
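
A sketch of the scale-down half of this proposal. The template side would pass --boot-disk-auto-delete to gcloud compute instance-templates create; the helper below only covers the per-instance override. The instance, disk and zone names are hypothetical, and the command is printed rather than run unless DRY_RUN=0.

```bash
#!/usr/bin/env bash
# Sketch: keep the boot disk when the instance is deleted during scale-down.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

keep_boot_disk() {
  local instance="$1" disk="$2" zone="$3"
  run gcloud compute instances set-disk-auto-delete "$instance" \
      --disk "$disk" --no-auto-delete --zone "$zone"
}

# Hypothetical instance/disk names; a managed instance group picks its own.
keep_boot_disk gcloudrig-abcd gcloudrig-abcd us-west2-b
```

With auto-delete on by default, any instance that is replaced outside scale-down cleans up after itself, and only the deliberate scale-down path keeps the disk.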

What is the point of using ZeroTier?

Since Parsec uses the STUN protocol, what is the need for ZeroTier? I understand that if one is under a double NAT where even STUN breaks down, it makes sense, but AFAIK, this is rarely the case in most residential network situations (which I assume is where most gaming is).

proposal: run a windows KVM with GPU passthrough on a GCP linux instance.

I was able to install Proxmox hypervisor on GCP, using GCPs nested virtualization kvm, and was able to run Windows 7 inside a proxmox VM running on a GCP instance.

It works perfectly! I used Windows 7 just because it was a smaller VM to upload, and easier for me to try.

The beauty of having Proxmox running on GCP is that you can run Proxmox locally on a home computer, do all the installation, create a proxmox backup, upload to a GCP bucket, and restore at your GCP proxmox instance!

I do have GPU passthrough working on Proxmox at home, so there's nothing preventing one from doing the same at GCP. In theory, it should just work and you would have a Windows VM with GPU at GCP, running on a Linux instance, at the price of a Linux instance! (no extra cost for a Windows Server license!)

and running it preemptible, it's as cheap as you can get!!

unfortunately I have no time to implement it myself on your gcloudrig project... but if you want the bits and bobs on how to set up the Proxmox image, or maybe I can even give you my already done Proxmox image if you want... ready to go! ;)

let me know...

-H

ERROR: no regions with accelerator type "nvidia-tesla-t4-vws" found

Activated [gcloudrig].
ERROR: gcloud crashed (ValueError): Invalid header value b'/usr/bin/../lib/google-cloud-sdk/lib/gcloud.py compute accelerator-types list --filter zone:(us-east1-b\nus-east1-c\nus-east1-d\nus-east4-c\nus-east4-b\nus-east4-a\nus-central1-c\nus-central1-a\nus-central1-f'

If you would like to report this issue, please run the following command:
  gcloud feedback

To check gcloud for common problems, please run the following command:
  gcloud info --run-diagnostics

#################################################################
ERROR: no regions with accelerator type "nvidia-tesla-t4-vws" found

using Google Cloud Shell
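
The traceback suggests the zone list reached gcloud with embedded newlines. One possible workaround, offered as an assumption rather than the project's actual fix, is to join the zones onto a single line before building the filter. The helper name join_zones is hypothetical.

```bash
#!/usr/bin/env bash
# Join newline-separated zone names into one space-separated line.
join_zones() {
  tr '\n' ' ' | sed 's/ *$//'
}

# Sample zones taken from the error message above.
zones="$(printf 'us-east1-b\nus-east1-c\nus-east1-d\n' | join_zones)"
filter="zone:(${zones})"
echo "$filter"
```

The resulting string contains no newlines, so it can be passed to --filter as a single argument.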

Resolution stuck on windows 10.

Hey, I have a resolution problem and I was wondering if you could help me with it.
Here's what I did:

  • Create a windows 10 VM in vmware.
  • Export it to a .ovf.
  • Upload it to a google bucket.
  • Create an instance with an nvidia-tesla-t4-vws using the .ovf from the bucket.
  • Once logged into the instance through RDP I followed your manual software installation steps but skipped the ZeroTier part.

After that whenever I log in through Parsec without connecting with RDP before, my resolution is stuck in 1366 x 768. I can't change the resolution through the NVIDIA control panel either. I also noticed I don't have access to NVIDIA RTX workstation panel because it seems I don't have the appropriate license but I think that's normal? What do you think.

PS: When I say my resolution is stuck, I mean that I can't set it higher than 1366 x 768.

ERROR: (gcloud.compute.instance-groups.managed.create) argument --zones: not enough args

@putty182
Error during ./setup.sh

Did the steps exactly

Created [https://www.googleapis.com/compute/v1/projects/cloudgamingrig-223523/global/instanceTemplates/gcloudrig-setup-template].
gcloudrig-setup-template

+ echo 'Creating managed instance group '\''gcloudrig-group'\''...'
Creating managed instance group 'gcloudrig-group'...
+ gcloud compute instance-groups managed create gcloudrig-group --base-instance-name gcloudrig --region '' --size 0 --template gcloudrig-setup-template --zones '' --format 'value(name)' --quiet
ERROR: (gcloud.compute.instance-groups.managed.create) argument --zones: not enough args
Usage: gcloud compute instance-groups managed create NAME --size=SIZE --template=TEMPLATE [optional flags]
  optional flags may be --base-instance-name | --description | --help |
                        --region | --target-pool | --zone | --zones
For detailed information on this command and its flags, run:
  gcloud compute instance-groups managed create --help

@deekue
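
The trace shows --region and --zones expanding to empty strings. A small guard like the following, a sketch with hypothetical helper and variable names, would stop the script with a clearer message before gcloud is called.

```bash
#!/usr/bin/env bash
# Fail early when a required setting is empty.
# $1 = variable name (for the message), $2 = its current value
require_set() {
  if [ -z "$2" ]; then
    echo "ERROR: $1 is empty; refusing to call gcloud" >&2
    return 1
  fi
}

# Hypothetical values reproducing the report: both settings are empty.
REGION=""
ZONES=""

if require_set REGION "$REGION" && require_set ZONES "$ZONES"; then
  echo "ok to create the managed instance group"
else
  echo "aborting before gcloud is invoked"
fi
```

This does not explain why the values were empty, but it turns a confusing argument error into a message that names the missing setting.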

gcloudrig_get_accelerator_zones Returns Nonexistent Zone in us-east1

When trying to run setup in us-east1, it fails with the following error:

ERROR: (gcloud.compute.instance-groups.managed.create) Could not fetch resource:
 - Invalid value for field 'resource.distributionPolicy.zones[0].zone': 'https://compute.googleapis.com/compute/v1/projects/massive-seer-267723/zones/us-east1-a'. Zone (us-east1-a) must be a valid zone in region us-east1.

According to the dev docs, there is no us-east1-a.

I have tried setting the zone manually in config.sh but it does not seem to pick it up. What am I doing wrong?

Thanks,
Owen
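
One way to avoid passing a nonexistent zone is to intersect the candidate zones with the zones the region actually has. The sketch below uses sample data in place of live gcloud output, and filter_valid_zones is a hypothetical helper, not a function from gcloudrig.

```bash
#!/usr/bin/env bash
# Keep only the candidate zones that also appear in the list of valid zones.
# Both arguments are newline-separated lists.
filter_valid_zones() {
  local candidates="$1" valid="$2"
  printf '%s\n' "$candidates" | grep -Fx -f <(printf '%s\n' "$valid")
}

# Sample data: us-east1 has zones b, c and d, but no zone a.
candidates=$'us-east1-a\nus-east1-b\nus-east1-c'
valid=$'us-east1-b\nus-east1-c\nus-east1-d'

filter_valid_zones "$candidates" "$valid"
```

In a real script the valid list would presumably come from gcloud compute zones list, filtered to the chosen region.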

New deprecation error in shell

When I run ./scale-up (first time, new vm) it gives me this error

WARNING: gcloud compute instance-groups managed wait-until-stable is deprecated. Please use gcloud compute instance-groups managed wait-until --stable instead.

It still works but is deprecated and I'm assuming will stop working in the future when the deprecated command is removed.
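
A sketch of the replacement call named in the warning, using the group name from the setup output and a region picked for illustration. The wrapper and its DRY_RUN switch are illustrative; the command is printed unless DRY_RUN=0.

```bash
#!/usr/bin/env bash
# Sketch: wait for the managed instance group using the non-deprecated form.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

wait_until_stable() {
  local group="$1" region="$2"
  run gcloud compute instance-groups managed wait-until "$group" \
      --stable --region "$region"
}

# Region chosen for illustration only.
wait_until_stable gcloudrig-group australia-southeast1
```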

EDIT: New Issue (VM randomly shuts down) when in rdp

Would taking snapshots instead of custom images be more cost-effective?

From this page, I can see that the pricing of snapshots per GB is less than that of custom images:

It seems that snapshots also incur additional network expenses, though, and I'm not sure whether they require the actual disk to be present the entire time (e.g., breaking the whole point of making an image and deleting the disk when not in use). Thoughts?

gcloudrig-games disk is not created from snapshot

This has happened to me twice now. I start the instance and it boots normally, but it never creates the games disk from the snapshot. I'm still trying to figure out why this is happening.

Issue with displaying projects in setup.sh

When running ./setup.sh, the projects are listed incorrectly. I don't believe this is a problem on my side.

Created [gcloudrig].
Activated [gcloudrig].
Select project to use:
1) Project
2) First
3) My
4) new project
#? 1
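
One possible cause, which the report does not confirm, is shell word splitting: if project names containing spaces are expanded unquoted, each word becomes its own menu entry. The sketch below uses made-up project names to show the difference between splitting on words and splitting on lines.

```bash
#!/usr/bin/env bash
# Sample data standing in for a newline-separated list of project names.
project_names=$'My First Project\nnew project'

# Unquoted expansion splits on every space, so each word is an entry.
split_words=( $project_names )

# Reading line by line keeps each project name intact.
by_line=()
while IFS= read -r line; do
  by_line+=("$line")
done <<< "$project_names"

echo "word-split: ${#split_words[@]} entries"
echo "line-split: ${#by_line[@]} entries"
```

Passing the line-split array to select as "${by_line[@]}" would then show one menu entry per project.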

Disk Maintenance - documentation

Hi,
I managed to get this to run. Thanks so much for all your work.

The disk management section of the documentation would be nice to be filled in.
At the moment I am struggling to understand where my setup information is stored, as I need to rerun setup every time I open the console.
But it does reuse the snapshot and the game disk.
Happy to help out with the docs if you want to just give me some pointers.

Setup should block and report progress to user until final steps are complete.

Setup currently just scaffolds an instance template and gives users no immediate indication of setup progress. (Logs work, but aren't as friendly)

Progress updates and PowerShell execution via metadata polling would allow each setup task/stage to be executed remotely, allowing for realtime feedback on setup progress and opening up opportunities for automated testing.

One to investigate while making the nodejs port!

line 146: boot: command not found [cloud shell]

@deekue (I used your pull request code)

Using Cloud Shell

I got this error when running ./setup.sh: line 146: boot: command not found. It appeared just after the check for GPU_ALL_REGIONS, which passed because I had submitted a quota request for it.

Silent Install for VB-Cable

Hi, sorry for not making this a pull request but I've not yet put the time in to figure out how to use Git properly! I stumbled across this looking for silent install switches for VB-Cable, I found that you can use -i -h to silence it. Also -u -h to uninstall!

Improve label-related warnings

When running setup.sh and scale-up.sh before having run scale-down.sh, one gets the below warning multiple times at different points in time of execution of these scripts. These warnings stop after running scale-down.sh (given that one hasn't run destroy.sh).

WARNING: The following filter keys were not present in any resource : labels.gcloudrig, labels.latest

Warnings are particularly scary in this project because of the risk of high cloud costs.

I suggest adding something like "If you haven't run gcloudrig/scale-down.sh yet, this is expected." to the warning message.
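
A sketch of how such a hint could be attached without touching gcloud itself: a filter that echoes every line and adds a note after the label warning. The function name and the wording of the note are illustrative.

```bash
#!/usr/bin/env bash
# Echo stdin unchanged, adding a hint after the label-filter warning.
annotate_label_warning() {
  local line
  while IFS= read -r line; do
    printf '%s\n' "$line"
    case "$line" in
      *"filter keys were not present in any resource"*)
        printf '%s\n' "NOTE: if you haven't run scale-down.sh yet, this is expected."
        ;;
    esac
  done
}

# Demo with the warning text from this report.
printf '%s\n' 'WARNING: The following filter keys were not present in any resource : labels.gcloudrig, labels.latest' \
  | annotate_label_warning
```

In the scripts, the filter would need to sit on the stderr stream of the gcloud calls that emit the warning.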

Add `--preemptible` switch as an option in setup.sh

I see with the updated script, it is preemptible now. I was reading that the $300 of free credits do not apply to preemptible machines? Even if I upgrade my account to paid-billing, it won't apply?

If so, will you make an option to NOT use preemptible machines with the script?

Consider using terraform instead of the current imperative mechanisms e.g. running `setup.sh` and `destroy.sh`

Why Terraform?

Note that for terraform to truly benefit the user it would require slightly more work for them: they would have to manage their terraform state file and customization (e.g. increases to the size of their disks) files, e.g. by committing those files to their own fork of this repository.

See also:

Issue with Post ZeroTier Setup Security

After finishing the setup, approving the VM on ZeroTier, clicking Disconnect RDP and connecting with Parsec, a problem occurs at the last instruction: Post ZeroTier Setup Security.

Here is the error PowerShell gives me

Protect-GcloudrigRemoteAccess : failed to get ZeroTier IPv4 Address
At line:1 char:28
+ &{Import-Module GcloudRig; Protect-GcloudrigRemoteAccess}
+                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException
    + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Protect-GcloudrigRemoteAccess

When checking the ZeroTier network status using the Show Networks option, the status is OK, Broadcast is ENABLED, bridging is disabled.

A lot of this script is broken atm, and it would be great if it started to work again.

Play nicely with Cloud Shell

On Cloud Shell, named configurations don't persist at all; not even across new "tabs" (tmux windows).

This is why:

## /google/devshell/bashrc.google:60

# Assigns a unique per current session configuration location for Cloud SDK
# tools to isolate independent Developer Shell session from each other.
export CLOUDSDK_CONFIG=$(mktemp -d)

Detecting a stock Cloud Shell (e.g. [ $GOOGLE_BASHRC_SOURCED == 1 ]) and either forcibly recreating our named config, or defaulting to the currently active config, would avoid the need for users to run setup.sh every new console session.

Reported in #36
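
A minimal sketch of the detection step, assuming GOOGLE_BASHRC_SOURCED is set to 1 in a stock Cloud Shell as described above. The function names and the two strategy labels are hypothetical. Unlike the unquoted test in the issue text, this version also behaves when the variable is unset.

```bash
#!/usr/bin/env bash
# True when the stock Cloud Shell bashrc has been sourced.
is_cloud_shell() {
  [ "${GOOGLE_BASHRC_SOURCED:-0}" = "1" ]
}

# Decide how to obtain a usable gcloud configuration.
config_strategy() {
  if is_cloud_shell; then
    echo "recreate"   # named configs are lost per session: create it again
  else
    echo "reuse"      # named config persists: just activate it
  fi
}

echo "strategy: $(config_strategy)"
```

The "recreate" branch would map to gcloud config configurations create, and "reuse" to gcloud config configurations activate, both with the gcloudrig configuration name shown in the setup output.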

Make "end session" a desktop icon

Just curious: instead of using cloud functions to clean up upon shutdown, how about using gcloud alpha cloud-shell + nohup as an alternative to run the cleanup script (snapshots and so on)?

Originally posted by @juzkev in #4 (comment)

Implement a desktop shortcut, within the rig, that triggers scale-down.sh activities. One option is to trigger a nohup'd command via gcloud alpha cloud-shell to trigger the scale down (being careful to appease cloud-shell's keep-alive checks)

Bonus points:
make this run only when a human "clicks shutdown" in Windows (e.g. explicitly only Windows-initiated shutdowns, not GCP-initiated ones, which will occur via ACPI G2 Soft Off)

Doesn't seem to be a way to reliably react only to ACPI G2 via Windows, so perhaps there might be something in GCP that could trigger this.

Not getting through the first part of setup

I cloned the repo, then cd'ed to it, and did gcloud init. Then when I do ./setup.sh it stops and says service usage API has not been used or disabled. I enabled compute engine and service usage apis so what's the problem?

Windows Server 2019 - "Settings" crashes instantly, cannot set display resolution via GUI

Something about disabling the Windows built-in display driver causes the Settings GUI in Windows Server 2019 to crash. Details are in the Windows event log when this occurs.

As a workaround, screen resolution can be changed on the command line, e.g. setres -w 1800 -h 1200 -f

Unfortunately there doesn't seem to be a one-shot command to change display scaling outside of this settings app.

Use "image family" instead of absolutely named image

Similar to how snapshots are being created, image names should include a timestamp or pseudo-random string, and be grouped under a common image family (and labels?) instead of a fixed name.

At the moment, the previous image is deleted before the new one is created; while we don't delete the source disk until the image is created, this is still risky and doesn't take advantage of image family settings

To save costs, older images should still be cleaned up immediately, but only after a new one exists, not before.
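
A sketch of the create-first ordering this proposal describes. The family name, boot disk name and zone are placeholders, and the helpers are hypothetical; the gcloud command is printed unless DRY_RUN=0.

```bash
#!/usr/bin/env bash
# Sketch: create a uniquely named image in a family before any cleanup.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Build a unique image name: prefix, UTC timestamp, 4 hex digits.
new_image_name() {
  printf '%s-%s-%04x\n' "$1" "$(date -u +%Y%m%d-%H%M%S)" "$RANDOM"
}

save_boot_image() {
  local family="$1" disk="$2" zone="$3" name
  name="$(new_image_name "$family")"
  # Create the new image first; older images in the family should only
  # be deleted after this step has succeeded.
  run gcloud compute images create "$name" \
      --family "$family" --source-disk "$disk" --source-disk-zone "$zone"
}

# Placeholder family, disk and zone.
save_boot_image gcloudrig gcloudrig-boot us-west2-b
```

With a family in place, the newest image can be looked up by family name rather than by a fixed image name.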

Use cloud functions

This would allow users to setup/start/stop rigs from a wider range of clients, without needing to launch a bash shell, e.g.

  • webhook from a discord bot, IFTTT button, google-oauth'd webpage
  • pub/sub handlers to deal with preemption and maintenance (reported by #32): not necessary, since boot disks persist just fine when terminated/stopped, provided you don't delete/replace the instance.

Initial scripts do not fully work with custom images

I created a bare-bones Windows 10 .vmdk image in VirtualBox and imported it to GCP right after finishing the installer setup, then added it to the gcloudrig script.

It is able to boot an instance from it and save the changes during scale-down for future runs. However, the games disk is never created or attached, and the software installer does not work at all. Is this WAI? Do we need any special configuration on our W10 images in order for them to be reachable by the scripts?

Windows Licensing fee doesn't stop when rig shutdown

Just spun up an instance this morning and it looks like the licensing fee is billing even when the rig is off.
I have about 1.43 hours of "On" time and 11.43 hours of Windows licensing fees.
Did I miss a step?

GPUs all regions required even when required quota is met

It sucks, because every time I open a new trial account with the credits I have to wait months and months of putting my quota request in for GPU All regions. I already have the Tesla P4 quota in the region, why do I need GPU ALL regions? I know why, but is it necessary?
