Go string 一清二楚

mb601ce0d29b15f · · 629 次点击 · · 开始浏览

这是一个创建于的文章，其中的信息可能已经有所发展或是发生改变。

watermark,size_16,text_QDUxQ1RP5Y2a5a6i,color_FFFFFF,t_100,g_se,x_10,y_10,shadow_90,type_ZmFuZ3poZW5naGVpdGk= 字符串（string）作为 go 语言的基本数据类型，在开发中必不可少，我们务必深入学习一下，做到一清二楚。

gopher

前言

字符串（string）作为 go 语言的基本数据类型，在开发中必不可少，我们务必深入学习一下，做到一清二楚。

本文假设读者已经知道切片（slice）的使用，如不了解，可阅读 Go 切片基本知识点

为了更好的理解后文，推荐先阅读 Unicode 字符集，UTF-8 编码

是什么

In Go, a string is in effect a read-only slice of bytes.

在 go 语言中，字符串实际上是一个只读的字节切片，其数据结构定义如下：

// runtime/string.go
type stringStruct struct {
	str unsafe.Pointer	// 指向底层字节数组的指针
	len int				// 字节数组的长度 
}

注意：byte 其实是 uint8 的类型别名

// byte is an alias for uint8 and is equivalent to uint8 in all ways. It is
// used, by convention, to distinguish byte values from 8-bit unsigned
// integer values.
type byte = uint8

怎么用

func main() {
	// 使用字符串字面量初始化
	var a = "hi,狗"
	fmt.Println(a)

	// 可以使用下标访问，但不可修改
	fmt.Printf("a[0] is %d\n", a[0])
	fmt.Printf("a[0:2] is %s\n", a[0:2])
	// a[0] = 'a' 编译报错，Cannot assign to a[0]
    
    // 字符串拼接
	var b = a + "狗"
	fmt.Printf("b is %s\n", b)

	// 使用内置 len() 函数获取其长度
	fmt.Printf("a's length is: %d\n", len(a))

	// 使用 for;len 遍历
	for i := 0; i < len(a); i++ {
		fmt.Println(i, a[i])
	}

	// 使用 for;range 遍历
	for i, v := range a {
		fmt.Println(i, v)
	}
}


/* output
hi,狗

a[0] is 104
a[0:2] is hi

b is hi,狗狗

a's length is: 6

0 104
1 105
2 44
3 231
4 139
5 151

0 104
1 105
2 44
3 29399
*/

如果读者在看上面的代码时有疑惑，不用着急，下文将会挨个解读。

只读

字符串常量会在编译期分配到只读段，对应数据地址不可写入，相同的字符串常量不会重复存储

func main() {
	var a = "hello"
	fmt.Println(a, &a, (*reflect.StringHeader)(unsafe.Pointer(&a)))
	a = "world"
	fmt.Println(a, &a, (*reflect.StringHeader)(unsafe.Pointer(&a)))
	var b = "hello"
	fmt.Println(b, &b, (*reflect.StringHeader)(unsafe.Pointer(&b)))
}

/* output
字符串字面量 该变量的内存地址 底层字节切片
hello 0xc0000381f0 &{5033779 5}
world 0xc0000381f0 &{5033844 5}
hello 0xc000038220 &{5033779 5}
*/

可以看到 hello 在底层只存储了一份

for;len 遍历

go 的源代码都是 UTF-8 编码格式的，上例中的”狗“字占用三个字节，即 231 139 151（Unicode Character Table），所以上例的运行结果很清楚。

于此同时，也可以将字符串转化为字节切片

func main() {
	var a = "hi,狗"
	b := []byte(a)
	fmt.Println(b)	// [104 105 44 231 139 151]
}

for;range 遍历

The Unicode standard uses the term "code point" to refer to the item represented by a single value.

在 Unicode 标准中，使用术语 code point 来表示由单个值表示的项，通俗点来说，U+72D7（十进制表示为 29399）代表符号 ”狗“

"Code point" is a bit of a mouthful, so Go introduces a shorter term for the concept: rune.

code point 有点拗口，所以在 go 语言中专门有一个术语来代表它，即 rune

注意：rune 其实是 int32 的类型别名

// rune is an alias for int32 and is equivalent to int32 in all ways. It is
// used, by convention, to distinguish character values from integer values.
type rune = int32

在对字符串类型进行 for;range 遍历时，其实是按照 rune 类型来解码的，所以上例的运行结果也很清晰。

与此同时，也可以将字符串转化为 rune 切片

func main() {
	// 使用字符串字面量初始化
	var a = "hi,狗"
	r := []rune(a)
	fmt.Println(r) // [104 105 44 29399]
}

当然我们也可以使用 "unicode/utf8" 标准库，手动实现 for;range 语法糖相同的效果

func main() {
	var a = "hi,狗"
	for i, w := 0, 0; i < len(a); i += w {
		runeValue, width := utf8.DecodeRuneInString(a[i:])
		fmt.Printf("%#U starts at byte position %d\n", runeValue, i)
		w = width
	}
}

/* output
U+0068 'h' starts at byte position 0
U+0069 'i' starts at byte position 1
U+002C ',' starts at byte position 2
U+72D7 '狗' starts at byte position 3
*/

有疑问加站长微信联系（非本文作者）

本文来自：51CTO博客

感谢作者：mb601ce0d29b15f

查看原文：Go string 一清二楚

入群交流（和以上内容无关）：加入Go大咖交流群，或添加微信：liuxiaoyan-s 备注：入群；或加QQ群：692541889

629 次点击

加入收藏微博

收入我的专栏

上一篇：go好用的类型转换第三方组件

下一篇：聊聊dubbo-go-proxy的ZookeeperRegistryLoad

切片

代码

slice

内存地址

0 回复

添加一条新回复（您需要登录后才能回复没有账号？）

请尽量让自己的回复能够对别人有帮助
支持 Markdown 格式, **粗体**、~~删除线~~、`单行代码`
支持 @ 本站用户；支持表情（输入 : 提示），见 Emoji cheat sheet
图片支持拖拽、截图粘贴等方式上传

关注我

扫码关注领全套学习资料
加入 QQ 群：
- 192706294（已满）
- 731990104（已满）
- 798786647（已满）
- 729884609（已满）
- 977810755（已满）
- 815126783（已满）
- 812540095（已满）
- 1006366459（已满）
- 692541889
加入微信群：liuxiaoyan-s，备注入群
也欢迎加入知识星球 Go粉丝们（免费）

Go string 一清二楚

前言

是什么

怎么用

只读

for;len 遍历

for;range 遍历

用户登录

今日阅读排行

一周阅读排行

关注我

前言

是什么

怎么用

只读

for;len 遍历

for;range 遍历

Go string 一清二楚

前言

是什么

怎么用

只读

for;len 遍历

for;range 遍历

用户登录

今日阅读排行

一周阅读排行

关注我

给该专栏投稿 写篇新文章

收入到我管理的专栏 新建专栏

前言

是什么

怎么用

只读

for;len 遍历

for;range 遍历

给该专栏投稿写篇新文章

收入到我管理的专栏新建专栏