Comments (15)
Hi, sorry for the delay!
I tried running on the 23.03 container but hit the same error.
This is the reproducer code that gives the same CUDA error:
`import numpy as np
import cirq
import qsimcirq
qubits = cirq.GridQubit.rect(5, 6)
gpu_options = qsimcirq.QSimOptions(gpu_mode=8, max_fused_gate_size=4)
qsim_simulator = qsimcirq.QSimSimulator(qsim_options=gpu_options)
circuit = cirq.Circuit()
circuit.append([cirq.X(qubits[k]) for k in range(30)])
circuit.append([cirq.depolarize(p=0.1)(qubits[k]) for k in range(30)])
circuit.append([cirq.measure(qubits[k]) for k in range(30)])
result = qsim_simulator.run(circuit, repetitions=200)`
Thanks!
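For context on GPU sizing, here is a rough estimate of the state-vector footprint for circuits of this size. It is only a sketch: it assumes single-precision complex amplitudes (8 bytes each), counts the total across all devices, and ignores any workspace the simulator allocates. `statevector_bytes` and `to_gib` are hypothetical helpers, not part of qsimcirq or cuQuantum.

```python
def statevector_bytes(nqubits: int, bytes_per_amplitude: int = 8) -> int:
    """Return the size in bytes of a dense state vector for `nqubits` qubits.

    One complex amplitude is stored per basis state. The default of 8 bytes
    corresponds to single-precision complex values; this is an assumption,
    not something read from the simulator.
    """
    if nqubits < 0:
        raise ValueError("nqubits must be non-negative")
    return (2 ** nqubits) * bytes_per_amplitude


def to_gib(nbytes: int) -> float:
    """Convert a byte count to GiB."""
    return nbytes / 2 ** 30


for n in (28, 30):
    print(f"{n} qubits: {to_gib(statevector_bytes(n)):.1f} GiB")
# prints:
# 28 qubits: 2.0 GiB
# 30 qubits: 8.0 GiB
```

Each additional qubit doubles the footprint, so a 30-qubit dense state needs about 8 GiB in total before any simulator overhead under these assumptions.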
from cuquantum.
Ah, I will go about acquiring some better GPUs in that case.
Thank you very much for your help.
from cuquantum.
@bramathon closing this issue as the posted problem appears to be fully addressed.
If you have other questions, issues, etc., please feel free to open a new issue/discussion referencing this issue.
from cuquantum.
Hi @kaarthikvarma, can you share more about which backend in the appliance you are using?
from cuquantum.
Thank you for the response!
So I run the container with "docker run --gpus all -it --rm nvcr.io/nvidia/cuquantum-appliance:22.11"
I don't change anything within the container.
I have NVIDIA driver version 525.85.12 and CUDA version 12.0.
from cuquantum.
Hi @kaarthikvarma Would it be possible to share a reproducer with us to investigate further into this issue? Also, could you try running your code with the latest 23.03 container and see if you encounter any issue? We fixed a few issues on the container offering, which may or may not be relevant for you (hard to judge without a reproducer). Thanks 🙂
from cuquantum.
Hi @kaarthikvarma thanks for your reproducer, it did make it easier for us to reason about the issue. We should be able to fix it in our next container release.
from cuquantum.
Just wanna keep everyone posted: We're still working on the 23.06 cuQuantum Appliance container release which will include the needed bug fix.
from cuquantum.
@kaarthikvarma we've published the 23.06 cuQuantum Appliance container on NGC, here.
You may pull it with:
docker pull nvcr.io/nvidia/cuquantum-appliance:23.06
from cuquantum.
Hi, I've hit a similar issue using the 23.06 cuquantum appliance.
I'm using the ghz.py script found in the examples folder.
First, I test that I am able to run the example script:
(cuquantum-23.06) cuquantum@5c71780f490b:~/examples$ python ghz.py
q(0),q(1),q(2)=110, 110, 111
Next, I added a single line to the script and saved it as noisy-ghz.py:
def main(nqubits=28, nrepetitions=10, ngpus=1):
    measure = True if nrepetitions > 0 else False
    circuit = make_ghz_circuit(nqubits, measure=measure)
    circuit = circuit.with_noise(cirq.depolarize(p=0.01))  # added line
Running this script gives the error:
(cuquantum-23.06) cuquantum@5c71780f490b:~/examples$ python noisy-ghz.py
custatevec error: internal error statespace_mgpu.h 685
from cuquantum.
@bramathon can you share the full modified example?
from cuquantum.
Thanks for the quick response @mtjrider .
Here is the full script:
# Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES
#
# SPDX-License-Identifier: BSD-3-Clause

import argparse

import cirq
import qsimcirq


parser = argparse.ArgumentParser(description='GHZ circuit')
parser.add_argument('--nqubits', type=int, default=3, help='the number of qubits in the circuit')
parser.add_argument('--nsamples', type=int, default=3, help='the number of samples to take')
parser.add_argument('--ngpus', type=int, default=1, help='the number of GPUs to use')


def create_qsim_options(
    max_fused_gate_size=2,
    disable_gpu=False,
    cpu_threads=1,
    gpu_mode=(0,),
    verbosity=0,
    n_subsvs=-1,
    use_sampler=None,
    debug=False
):
    return qsimcirq.QSimOptions(
        max_fused_gate_size=max_fused_gate_size,
        disable_gpu=disable_gpu,
        cpu_threads=cpu_threads,
        gpu_mode=gpu_mode,
        verbosity=verbosity,
        n_subsvs=n_subsvs,
        use_sampler=use_sampler,
        debug=debug
    )


def qsim_options_from_arguments(ngpus):
    if ngpus > 1:
        return create_qsim_options(gpu_mode=ngpus)
    elif ngpus == 1:
        return create_qsim_options()
    elif ngpus == 0:
        return create_qsim_options(disable_gpu=True, gpu_mode=0, use_sampler=False)


def make_ghz_circuit(nqubits, measure=False):
    qubits = cirq.LineQubit.range(nqubits)
    circuit = cirq.Circuit()
    circuit.append(cirq.H(qubits[0]))
    circuit.append(cirq.CNOT(qubits[idx], qubits[idx + 1]) for idx in range(nqubits - 1))
    if measure:
        circuit.append(cirq.measure(*qubits))
    return circuit


def main(nqubits=28, nrepetitions=10, ngpus=1):
    measure = True if nrepetitions > 0 else False
    circuit = make_ghz_circuit(nqubits, measure=measure)
    circuit = circuit.with_noise(cirq.depolarize(p=0.01))
    qsim_options = qsim_options_from_arguments(ngpus)
    simulator = qsimcirq.QSimSimulator(qsim_options=qsim_options)
    if nrepetitions > 0:
        results = simulator.run(circuit, repetitions=nrepetitions)
    else:
        results = simulator.simulate(circuit)
    print(results)


if __name__ == '__main__':
    args = parser.parse_args()
    main(nqubits=args.nqubits, nrepetitions=args.nsamples, ngpus=args.ngpus)
from cuquantum.
Thanks!
@bramathon can you tell me what system/GPUs you're using?
from cuquantum.
I may be conflating multiple issues. On the first run, the ghz.py example works. However, if I try to run it again, I get the following trace:
Traceback (most recent call last):
  File "/home/cuquantum/examples/ghz.py", line 72, in <module>
    main(nqubits=args.nqubits, nrepetitions=args.nsamples, ngpus=args.ngpus)
  File "/home/cuquantum/examples/ghz.py", line 64, in main
    results = simulator.run(circuit, repetitions=nrepetitions)
  File "/home/cuquantum/conda/envs/cuquantum-23.06/lib/python3.9/site-packages/cirq/work/sampler.py", line 63, in run
    return self.run_sweep(program, param_resolver, repetitions)[0]
  File "/home/cuquantum/conda/envs/cuquantum-23.06/lib/python3.9/site-packages/cirq/sim/simulator.py", line 72, in run_sweep
    return list(self.run_sweep_iter(program, params, repetitions))
  File "/home/cuquantum/conda/envs/cuquantum-23.06/lib/python3.9/site-packages/cirq/sim/simulator.py", line 103, in run_sweep_iter
    records = self._run(
  File "/home/cuquantum/conda/envs/cuquantum-23.06/lib/python3.9/site-packages/qsimcirq/qsim_simulator.py", line 324, in _run
    return self._sample_measure_results(solved_circuit, repetitions)
  File "/home/cuquantum/conda/envs/cuquantum-23.06/lib/python3.9/site-packages/qsimcirq/qsim_simulator.py", line 445, in _sample_measure_results
    results[key][:, i, :] = full_results[:, meas_indices] ^ invert_mask
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
However, the "custatevec error: internal error statespace_mgpu.h 685" occurs on the first run as well as on subsequent runs.
> @bramathon can you tell me what system/GPUs you're using?
Quick dump of my system information:
Client: Docker Engine - Community
Cloud integration: 1.0.17
Version: 24.0.5
bevert@RM-LUBU-F2LPE2E:~$ nvidia-smi
Fri Sep 1 19:09:17 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.05              Driver Version: 535.86.05    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Quadro P520                    Off | 00000000:2D:00.0 Off |                  N/A |
| N/A   39C    P8              N/A / ERR! |      4MiB /  2048MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      3208      G   /usr/lib/xorg/Xorg                            4MiB |
+---------------------------------------------------------------------------------------+
bevert@RM-LUBU-F2LPE2E:~$ lspci
00:00.0 Host bridge: Intel Corporation Comet Lake-U v1 4c Host Bridge/DRAM Controller (rev 0c)
00:02.0 VGA compatible controller: Intel Corporation CometLake-U GT2 [UHD Graphics] (rev 02)
00:04.0 Signal processing controller: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem (rev 0c)
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
00:12.0 Signal processing controller: Intel Corporation Comet Lake Thermal Subsytem
00:14.0 USB controller: Intel Corporation Comet Lake PCH-LP USB 3.1 xHCI Host Controller
00:14.2 RAM memory: Intel Corporation Comet Lake PCH-LP Shared SRAM
00:14.3 Network controller: Intel Corporation Comet Lake PCH-LP CNVi WiFi
00:16.0 Communication controller: Intel Corporation Comet Lake Management Engine Interface
00:1c.0 PCI bridge: Intel Corporation Comet Lake PCI Express Root Port #1 (rev f0)
00:1c.4 PCI bridge: Intel Corporation Comet Lake PCI Express Root Port #5 (rev f0)
00:1d.0 PCI bridge: Intel Corporation Comet Lake PCI Express Root Port #9 (rev f0)
00:1d.4 PCI bridge: Intel Corporation Comet Lake PCI Express Root Port #13 (rev f0)
00:1f.0 ISA bridge: Intel Corporation Comet Lake PCH-LP LPC Premium Controller/eSPI Controller
00:1f.3 Audio device: Intel Corporation Comet Lake PCH-LP cAVS
00:1f.4 SMBus: Intel Corporation Comet Lake PCH-LP SMBus Host Controller
00:1f.5 Serial bus controller: Intel Corporation Comet Lake SPI (flash) Controller
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (10) I219-V
02:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS522A PCI Express Card Reader (rev 01)
03:00.0 PCI bridge: Intel Corporation JHL6240 Thunderbolt 3 Bridge (Low Power) [Alpine Ridge LP 2016] (rev 01)
04:00.0 PCI bridge: Intel Corporation JHL6240 Thunderbolt 3 Bridge (Low Power) [Alpine Ridge LP 2016] (rev 01)
04:01.0 PCI bridge: Intel Corporation JHL6240 Thunderbolt 3 Bridge (Low Power) [Alpine Ridge LP 2016] (rev 01)
04:02.0 PCI bridge: Intel Corporation JHL6240 Thunderbolt 3 Bridge (Low Power) [Alpine Ridge LP 2016] (rev 01)
05:00.0 System peripheral: Intel Corporation JHL6240 Thunderbolt 3 NHI (Low Power) [Alpine Ridge LP 2016] (rev 01)
2b:00.0 USB controller: Intel Corporation JHL6240 Thunderbolt 3 USB 3.1 Controller (Low Power) [Alpine Ridge LP 2016] (rev 01)
2d:00.0 3D controller: NVIDIA Corporation GP108GLM [Quadro P520] (rev a1)
2e:00.0 Non-Volatile memory controller: Sandisk Corp WD Black SN750 / PC SN730 NVMe SSD
bevert@RM-LUBU-F2LPE2E:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0
from cuquantum.
Thanks for the detailed information.
P520 is built with the Pascal architecture. We only support Volta and newer.
We document this here.
I'm actually surprised the code runs at all.
I did run your modified example on a DGX A100 (all 8 GPUs) and confirmed it works. In the test, I used --nsamples 10 and added a line to print the circuit. I've attached the output.
cuquantum-23.06-noisy-ghz-output.txt
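As a quick pre-flight check for the architecture requirement mentioned above, the following sketch flags GPUs below Volta (compute capability 7.0). It assumes a driver recent enough that `nvidia-smi` accepts the `compute_cap` query field. The helper names are hypothetical, and the sample text is illustrative rather than captured output (the Quadro P520 is a Pascal part with compute capability 6.1).

```python
import subprocess

MIN_COMPUTE_CAPABILITY = (7, 0)  # Volta


def parse_compute_caps(csv_text):
    """Parse lines like 'Quadro P520, 6.1' into (name, (major, minor)) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma so a comma in the GPU name would not break parsing.
        name, cap = (field.strip() for field in line.rsplit(",", 1))
        major, minor = (int(part) for part in cap.split("."))
        gpus.append((name, (major, minor)))
    return gpus


def unsupported_gpus(gpus, minimum=MIN_COMPUTE_CAPABILITY):
    """Return the names of GPUs whose compute capability is below `minimum`."""
    return [name for name, cap in gpus if cap < minimum]


def query_gpus():
    """Query the local GPUs via nvidia-smi (assumes the compute_cap field is supported)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_compute_caps(out)


# Illustrative sample text instead of a live query, so this runs without a GPU.
sample = "Quadro P520, 6.1\nNVIDIA A100-SXM4-40GB, 8.0\n"
print(unsupported_gpus(parse_compute_caps(sample)))
# prints: ['Quadro P520']
```

On a machine with an NVIDIA driver installed, `unsupported_gpus(query_gpus())` would run the same check against the local devices.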
from cuquantum.