Comments (7)
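One problem with the direct cast is that the resulting byte slice ends up with a very large cap, which is undesirable; see for example https://play.golang.org/p/_tqfAgxlZAv . So the quick-and-dirty cast is not advisable, because it has no way to obtain a cap for the byte slice. A better-performing implementation is the one in fasthttp ( https://github.com/valyala/fasthttp/blob/c48d3735fa9864a7c1724168812f3571c8313581/bytesconv.go#L387 ).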
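For readers on Go 1.20 or later, a minimal sketch of a zero-copy conversion that does end up with a correct cap is shown below. It is not the fasthttp code linked above; `stringToBytes` is a hypothetical helper name, and the sketch assumes the caller never writes to the returned slice.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringToBytes returns a []byte that shares memory with s (no copy).
// Hypothetical helper; needs Go 1.20+ for unsafe.StringData.
// The result must be treated as read-only: strings are immutable,
// so writing through the slice is undefined behavior.
func stringToBytes(s string) []byte {
	if len(s) == 0 {
		return nil
	}
	// unsafe.Slice sets both len and cap of the result to len(s).
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
	b := stringToBytes("Roger")
	fmt.Println(b, len(b), cap(b)) // [82 111 103 101 114] 5 5
}
```

Because cap equals len here, an append on the result is forced to reallocate instead of writing past the end of the string's memory.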
from go-questions.
Yes, the official documentation already points out this problem:
the Data field is not sufficient to guarantee the data it references will not be garbage collected, so programs must keep a separate, correctly typed pointer to the underlying data.
-- https://golang.org/pkg/reflect/#SliceHeader
The original code is wrong.
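It doesn't need to be this complicated. You can switch directly to an unsafe cast, which is also more efficient:
func string2bytes(s string) []byte {
	return *(*[]byte)(unsafe.Pointer(&s))
}
Appendix: performance comparison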
// main.go
package main

import (
	"reflect"
	"unsafe"
)

// string2bytes1 copies the string header fields into a slice header
// one by one, so Cap is explicitly set to the string length.
func string2bytes1(s string) []byte {
	stringHeader := (*reflect.StringHeader)(unsafe.Pointer(&s))
	var b []byte
	pbytes := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	pbytes.Data = stringHeader.Data
	pbytes.Len = stringHeader.Len
	pbytes.Cap = stringHeader.Len
	return b
}

// string2bytes2 reinterprets the string header as a slice header.
// Data and Len line up, but Cap is read from whatever follows s in memory.
func string2bytes2(s string) []byte {
	return *(*[]byte)(unsafe.Pointer(&s))
}

// main_test.go
package main

import (
	"fmt"
	"math/rand"
	"reflect"
	"testing"
	"time"
)

func TestString2Bytes(t *testing.T) {
	s := "qcrao/Go-Questions"
	if string(string2bytes2(s)) != s {
		t.Fatalf("string2bytes2 is not properly implemented")
	}
	if !reflect.DeepEqual(string2bytes1(s), string2bytes2(s)) {
		t.Fatalf("strings2bytes implementation does not match")
	}
}

func init() {
	rand.Seed(time.Now().UnixNano())
}

var letterRunes = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")

// genstring returns a random ASCII-letter string of length n.
func genstring(n int) string {
	b := make([]rune, n)
	for i := range b {
		b[i] = letterRunes[rand.Intn(len(letterRunes))]
	}
	return string(b)
}

func BenchmarkString2Bytes(b *testing.B) {
	funcs := map[string]func(string) []byte{
		"string2bytes1": string2bytes1,
		"string2bytes2": string2bytes2,
	}
	for name, f := range funcs {
		// <= so that the 10000 case shown in the results below is included.
		for i := 1; i <= 10000; i *= 10 {
			s := genstring(i)
			b.Run(fmt.Sprintf("%v-%v", name, i), func(b *testing.B) {
				for i := 0; i < b.N; i++ {
					f(s)
				}
			})
		}
	}
}
$ go test -v -run=none -bench=. -benchmem -count=10 . | tee bench.txt
$ benchstat bench.txt
name time/op
String2Bytes/string2bytes1-1-12 3.07ns ± 1%
String2Bytes/string2bytes1-10-12 3.08ns ± 2%
String2Bytes/string2bytes1-100-12 3.08ns ± 1%
String2Bytes/string2bytes1-1000-12 3.08ns ± 0%
String2Bytes/string2bytes1-10000-12 3.07ns ± 1%
String2Bytes/string2bytes2-1-12 1.95ns ± 2%
String2Bytes/string2bytes2-10-12 1.95ns ± 2%
String2Bytes/string2bytes2-100-12 1.94ns ± 1%
String2Bytes/string2bytes2-1000-12 1.95ns ± 1%
String2Bytes/string2bytes2-10000-12 1.96ns ± 3%
name alloc/op
String2Bytes/string2bytes1-1-12 0.00B
String2Bytes/string2bytes1-10-12 0.00B
String2Bytes/string2bytes1-100-12 0.00B
String2Bytes/string2bytes1-1000-12 0.00B
String2Bytes/string2bytes1-10000-12 0.00B
String2Bytes/string2bytes2-1-12 0.00B
String2Bytes/string2bytes2-10-12 0.00B
String2Bytes/string2bytes2-100-12 0.00B
String2Bytes/string2bytes2-1000-12 0.00B
String2Bytes/string2bytes2-10000-12 0.00B
@changkun Strictly speaking, the string2bytes2 conversion function is wrong, because cap is never properly assigned during the conversion.
package main

import (
	"fmt"
	"reflect"
	"runtime"
	"unsafe"
)

func string2bytes1(s string) []byte {
	stringHeader := (*reflect.StringHeader)(unsafe.Pointer(&s))
	var b []byte
	pBytes := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	pBytes.Data = stringHeader.Data
	pBytes.Len = stringHeader.Len
	pBytes.Cap = stringHeader.Len
	runtime.KeepAlive(s)
	return b
}

func string2bytes2(s string) []byte {
	return *(*[]byte)(unsafe.Pointer(&s))
}

func main() {
	s1 := string2bytes1("Roger")
	fmt.Println(s1)
	fmt.Println(len(s1))
	fmt.Println(cap(s1))

	s2 := string2bytes2("Roger")
	fmt.Println(s2)
	fmt.Println(len(s2))
	fmt.Println(cap(s2))
}
The cap printed for s2 will be an unpredictable value:
[82 111 103 101 114]
5
5
[82 111 103 101 114]
5
4840475
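If the one-line cast is still wanted, one possible way to give it a well-defined cap is to cast from a temporary struct that has the same three-word layout as a slice header. The sketch below is illustrative only: `string2bytes3` is a hypothetical name, and it assumes the current gc layout (Data, Len, Cap) described by reflect.SliceHeader, which is not guaranteed to stay stable. The result must still be treated as read-only.

```go
package main

import (
	"fmt"
	"unsafe"
)

// string2bytes3 is a hypothetical variant of string2bytes2.
// The embedded string contributes the Data and Len words; the explicit
// Cap field supplies the third word that a bare string header lacks.
func string2bytes3(s string) []byte {
	h := struct {
		string
		Cap int
	}{s, len(s)}
	return *(*[]byte)(unsafe.Pointer(&h))
}

func main() {
	b := string2bytes3("Roger")
	fmt.Println(b)      // [82 111 103 101 114]
	fmt.Println(len(b)) // 5
	fmt.Println(cap(b)) // 5
}
```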
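@luojiego Sorry, but I think this is a decision for the implementer rather than a question of right or wrong. If we are going to argue "strictly speaking", then you should not write this kind of implementation at all: either do an honest conversion with a copy, or use bytes.Buffer from the standard library. Also, the runtime.KeepAlive(s) in string2bytes1 is unnecessary.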
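For comparison, the two safe alternatives mentioned above can be sketched as follows. The helper names `copyBytes` and `bufferBytes` are made up for this example; both produce a slice with its own backing array, so modifying it leaves the original string untouched.

```go
package main

import (
	"bytes"
	"fmt"
)

// copyBytes uses the built-in conversion, which allocates and copies.
func copyBytes(s string) []byte {
	return []byte(s)
}

// bufferBytes goes through bytes.Buffer; NewBufferString also copies s.
func bufferBytes(s string) []byte {
	return bytes.NewBufferString(s).Bytes()
}

func main() {
	s := "Roger"

	b1 := copyBytes(s)
	b1[0] = 'D' // safe: b1 does not alias s
	fmt.Println(s, string(b1)) // Roger Doger

	b2 := bufferBytes(s)
	b2[0] = 'L'
	fmt.Println(s, string(b2)) // Roger Loger
}
```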
OK, thank you very much!
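Why does the cap value come out so large? From the assembly it looks as if the cap value is the address of the string's Data, but this is not reliably reproducible.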
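Regarding why the cap value is so large: src/reflect/value.go contains the definitions of the underlying structs for string and []byte. Because []byte has one more field than string, namely Cap, using the unsafe package to convert a string directly into a slice inevitably leaves Cap incorrectly assigned.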
// StringHeader is the runtime representation of a string.
// It cannot be used safely or portably and its representation may
// change in a later release.
// Moreover, the Data field is not sufficient to guarantee the data
// it references will not be garbage collected, so programs must keep
// a separate, correctly typed pointer to the underlying data.
type StringHeader struct {
	Data uintptr
	Len  int
}

// SliceHeader is the runtime representation of a slice.
// It cannot be used safely or portably and its representation may
// change in a later release.
// Moreover, the Data field is not sufficient to guarantee the data
// it references will not be garbage collected, so programs must keep
// a separate, correctly typed pointer to the underlying data.
type SliceHeader struct {
	Data uintptr
	Len  int
	Cap  int
}
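The size mismatch between the two headers can be checked directly. The following is a small sketch, assuming the standard gc toolchain; `stringWords` and `sliceWords` are hypothetical helper names. A string value occupies two machine words and a slice value three, so reinterpreting a *string as a *[]byte reads one extra word past the string header, and that word is what shows up as Cap.

```go
package main

import (
	"fmt"
	"unsafe"
)

// wordSize is the size of one machine word on the current platform.
const wordSize = unsafe.Sizeof(uintptr(0))

// stringWords reports how many machine words a string value occupies.
func stringWords() uintptr {
	var s string
	return unsafe.Sizeof(s) / wordSize
}

// sliceWords reports how many machine words a []byte value occupies.
func sliceWords() uintptr {
	var b []byte
	return unsafe.Sizeof(b) / wordSize
}

func main() {
	fmt.Println(stringWords(), sliceWords()) // 2 3
}
```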