Trying to Understand Graph and Session


This article is mainly about building a conceptual understanding of how a model is represented by tf.Graph, with some sample code (copied, frankly, from the official tutorial) to help. It is based entirely on the official guide listed under Reference below; that guide covers a lot more, including features I currently have no use for, so I am leaving those out for now and will update this post once I actually need them.

TensorFlow represents your model as a dataflow graph. When you work with the lower-level APIs, this means you first construct a dataflow graph and then create a Session to execute that graph on the devices of a single machine or a distributed cluster.
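To make this two-phase pattern concrete, here is a minimal sketch of my own (not from the official guide): build a tiny graph, then run it in a session.

import tensorflow as tf

# Phase 1: build the dataflow graph. No computation happens yet;
# these calls only add nodes to the default graph.
a = tf.constant(3.0)
b = tf.constant(4.0)
total = a + b

# Phase 2: create a session to actually execute the graph.
with tf.Session() as sess:
    print(sess.run(total))  # => 7.0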

The dataflow graph is a common programming model for parallel computing. In such a graph, nodes represent units of computation and edges represent the data flowing between them. For example, in a TensorFlow graph the tf.matmul operation appears as a single node with two incoming edges (the matrices to multiply) and one outgoing edge (the result).
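As a small illustration (the op name "my_matmul" is just an example I made up), you can inspect the node that tf.matmul creates and count its incoming and outgoing edges:

import tensorflow as tf

a = tf.constant([[1.0, 2.0]])    # 1x2 matrix
b = tf.constant([[3.0], [4.0]])  # 2x1 matrix
c = tf.matmul(a, b, name="my_matmul")

op = c.op               # the tf.Operation node that produces `c`
print(op.type)          # => "MatMul"
print(len(op.inputs))   # => 2  (two incoming edges)
print(len(op.outputs))  # => 1  (one outgoing edge, the tensor `c`)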

Dataflow graphs have a number of advantages: the explicit structure makes parallel and distributed execution straightforward, and the graph itself is a portable description of the model that can be saved and run elsewhere.

tf.Graph

A tf.Graph holds two related kinds of information: the graph structure (the tf.Operation nodes and tf.Tensor edges and how they connect) and graph collections (general-purpose metadata that can be attached to the graph with tf.add_to_collection and read back with tf.get_collection).
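A rough sketch of the two kinds of information (the collection name "my_losses" is made up for illustration):

import tensorflow as tf

v = tf.Variable(1.0, name="v")           # automatically added to the GLOBAL_VARIABLES collection
loss = tf.constant(0.5, name="loss")
tf.add_to_collection("my_losses", loss)  # attach metadata under a custom collection name

g = tf.get_default_graph()
print(g.get_operations())                                # graph structure: every op added so far
print(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES))  # => [<tf.Variable 'v:0' ...>]
print(tf.get_collection("my_losses"))                    # => [<tf.Tensor 'loss:0' ...>]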

Building the graph

Most TensorFlow programs begin with a graph-construction phase. In this phase you call API functions that create tf.Operation objects (nodes) and tf.Tensor objects (edges) and add them to a tf.Graph instance. TensorFlow provides a default graph, and unless you say otherwise, every operation you create goes straight onto that default graph.
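For instance, a trivial sketch showing that an op created without mentioning any graph ends up on the default graph:

import tensorflow as tf

# No graph is specified, so the new constant op goes onto the default graph.
c = tf.constant(42.0, name="answer")
assert c.graph is tf.get_default_graph()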

Naming the nodes and edges in the graph

A tf.Graph defines a namespace for the tf.Operation objects it contains. It will pick a unique name for each object automatically, but choosing names yourself usually makes the graph more logical and easier to navigate.

c_0 = tf.constant(0, name="c")  # => operation named "c"

# Already-used names will be "uniquified".
c_1 = tf.constant(2, name="c")  # => operation named "c_1"

# Name scopes add a prefix to all operations created in the same context.
with tf.name_scope("outer"):
    c_2 = tf.constant(2, name="c")  # => operation named "outer/c"

    # Name scopes nest like paths in a hierarchical file system.
    with tf.name_scope("inner"):
        c_3 = tf.constant(3, name="c")  # => operation named "outer/inner/c"

    # Exiting a name scope context will return to the previous prefix.
    c_4 = tf.constant(4, name="c")  # => operation named "outer/c_1"

    # Already-used name scopes will be "uniquified".
    with tf.name_scope("inner"):
        c_5 = tf.constant(5, name="c")  # => operation named "outer/inner_1/c"

Naming things this way pays off when you visualize the graph with a tool like TensorBoard: the structure becomes clear at a glance. For more on TensorBoard, feel free to look at my posts Tensorboard Tips and Graph Visualizing Understanding, or the official Visualizing your graph guide.
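As a quick sketch of how the graph gets into TensorBoard (the log directory below is an arbitrary example path), you can dump the default graph with a tf.summary.FileWriter:

import tensorflow as tf

with tf.name_scope("outer"):
    c = tf.constant(1.0, name="c")   # shows up as "outer/c" in the graph view

# Write the default graph to an event file that TensorBoard can read.
writer = tf.summary.FileWriter("/tmp/graph_demo", tf.get_default_graph())
writer.close()
# Then start TensorBoard with: tensorboard --logdir /tmp/graph_demo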

Note that tf.Tensor objects have names too, of the form <OP_NAME>:<i>, where <OP_NAME> is the name of the operation that produces the tensor and <i> is an integer index identifying that tensor among the operation's outputs.
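A short sketch of these tensor names (the op names here are only illustrative):

import tensorflow as tf

c = tf.constant([1.0, 2.0], name="c")
print(c.name)         # => "c:0"  (output 0 of the operation named "c")

parts = tf.split(c, 2, name="split")  # one operation with two output tensors
print(parts[0].name)  # => "split:0"
print(parts[1].name)  # => "split:1"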

Placing operations on different devices

If you want your program to run across multiple devices, tf.device lets you specify which device a particular operation should run on. A device name has the following form:

/job:<JOB_NAME>/task:<TASK_INDEX>/device:<DEVICE_TYPE>:<DEVICE_INDEX>

For the most part you don't need to assign devices by hand, but if you insist:

# Operations created outside either context will run on the "best possible"
# device. For example, if you have a GPU and a CPU available, and the operation
# has a GPU implementation, TensorFlow will choose the GPU.
weights = tf.random_normal(...)

with tf.device("/device:CPU:0"):
    # Operations created in this context will be pinned to the CPU.
    img = tf.decode_jpeg(tf.read_file("img.jpg"))

with tf.device("/device:GPU:0"):
    # Operations created in this context will be pinned to the GPU.
    result = tf.matmul(weights, img)

In a distributed setting you can use full device specifications to pin the variables to parameter-server tasks and the computation to worker tasks:

with tf.device("/job:ps/task:0"):
    weights_1 = tf.Variable(tf.truncated_normal([784, 100]))
    biases_1 = tf.Variable(tf.zeros([100]))

with tf.device("/job:ps/task:1"):
    weights_2 = tf.Variable(tf.truncated_normal([100, 10]))
    biases_2 = tf.Variable(tf.zeros([10]))

with tf.device("/job:worker"):
    layer_1 = tf.matmul(train_batch, weights_1) + biases_1
    layer_2 = tf.matmul(layer_1, weights_2) + biases_2

Alternatively, tf.train.replica_device_setter can place the variables on parameter-server tasks for you:

with tf.device(tf.train.replica_device_setter(ps_tasks=3)):
    # tf.Variable objects are, by default, placed on tasks in "/job:ps" in a
    # round-robin fashion.
    w_0 = tf.Variable(...)  # placed on "/job:ps/task:0"
    b_0 = tf.Variable(...)  # placed on "/job:ps/task:1"
    w_1 = tf.Variable(...)  # placed on "/job:ps/task:2"
    b_1 = tf.Variable(...)  # placed on "/job:ps/task:0"

    input_data = tf.placeholder(tf.float32)     # placed on "/job:worker"
    layer_0 = tf.matmul(input_data, w_0) + b_0  # placed on "/job:worker"
    layer_1 = tf.matmul(layer_0, w_1) + b_1     # placed on "/job:worker"

Using tf.Session

A tf.Session represents the connection between your program and the TensorFlow runtime; once the graph is built, you create a session to actually run it. A few demos:

Demo 1: creating a session and evaluating tensors

x = tf.constant([[37.0, -23.0], [1.0, 4.0]])
w = tf.Variable(tf.random_uniform([2, 2]))
y = tf.matmul(x, w)
output = tf.nn.softmax(y)
init_op = w.initializer

with tf.Session() as sess:
    # Run the initializer on `w`.
    sess.run(init_op)

    # Evaluate `output`. `sess.run(output)` will return a NumPy array containing
    # the result of the computation.
    print(sess.run(output))

    # Evaluate `y` and `output`. Note that `y` will only be computed once, and its
    # result used both to return `y_val` and as an input to the `tf.nn.softmax()`
    # op. Both `y_val` and `output_val` will be NumPy arrays.
    y_val, output_val = sess.run([y, output])

Demo 2: feeding values into a placeholder

# Define a placeholder that expects a vector of three floating-point values,
# and a computation that depends on it.
x = tf.placeholder(tf.float32, shape=[3])
y = tf.square(x)

with tf.Session() as sess:
    # Feeding a value changes the result that is returned when you evaluate `y`.
    print(sess.run(y, {x: [1.0, 2.0, 3.0]}))  # => "[1.0, 4.0, 9.0]"
    print(sess.run(y, {x: [0.0, 0.0, 5.0]}))  # => "[0.0, 0.0, 25.0]"

    # Raises `tf.errors.InvalidArgumentError`, because you must feed a value for
    # a `tf.placeholder()` when evaluating a tensor that depends on it.
    sess.run(y)

    # Raises `ValueError`, because the shape of `37.0` does not match the shape
    # of placeholder `x`.
    sess.run(y, {x: 37.0})

Demo 3: collecting run options and metadata

y = tf.matmul([[37.0, -23.0], [1.0, 4.0]], tf.random_uniform([2, 2]))

with tf.Session() as sess:
    # Define options for the `sess.run()` call.
    options = tf.RunOptions()
    options.output_partition_graphs = True
    options.trace_level = tf.RunOptions.FULL_TRACE

    # Define a container for the returned metadata.
    metadata = tf.RunMetadata()

    sess.run(y, options=options, run_metadata=metadata)

    # Print the subgraphs that executed on each device.
    print(metadata.partition_graphs)

    # Print the timings of each operation that executed.
    print(metadata.step_stats)

Working with multiple graphs at once

g_1 = tf.Graph()
with g_1.as_default():
    # Operations created in this scope will be added to `g_1`.
    c = tf.constant("Node in g_1")

    # Sessions created in this scope will run operations from `g_1`.
    sess_1 = tf.Session()

g_2 = tf.Graph()
with g_2.as_default():
    # Operations created in this scope will be added to `g_2`.
    d = tf.constant("Node in g_2")

# Alternatively, you can pass a graph when constructing a `tf.Session`:
# `sess_2` will run operations from `g_2`.
sess_2 = tf.Session(graph=g_2)

assert c.graph is g_1
assert sess_1.graph is g_1

assert d.graph is g_2

Reference
