Author Contributions

Yang Zhang: Conceptualization, Methodology, Software, Writing - Original draft preparation
Kai Huo: Data curation and Supervision
Zhen Liu: Investigation and Supervision
Yu Zang: Writing - Reviewing and Editing
Yongxiang Liu: Validation and Supervision
Xiang Li: Validation and Supervision
Qianyu Zhang: Investigation and Editing
Cheng Wang: Investigation and Supervision
Conflict of Interest Statement

Declaration of interests

[x] The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

[ ] The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:

We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome. We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed.
Signed by all authors.
PGNet: A Part-based Generative Network for 3D Object Reconstruction

Yang Zhang a,b, Kai Huo b, Zhen Liu b, Yu Zang a,*, Yongxiang Liu b, Xiang Li b, Qianyu Zhang c, Cheng Wang a

a School of Information Science and Technology, Xiamen University, Xiamen 361005, China
b College of Electronic Science, National University of Defense Technology, Changsha 410073, China
c College of Geography, University of Leeds, Leeds LS2 9JT, UK
Abstract
Deep-learning generative methods have developed rapidly; for example, various single- and multi-view generative methods for meshes, voxels, and point clouds have been introduced. However, most 3D single-view reconstruction methods generate whole objects at one time, or in a cascaded way for dense structures, which misses the local details of fine-grained structures. These methods also fail when the generative models are required to provide semantic information for parts. This paper proposes an efficient part-based recurrent generative network, which aims to generate object parts sequentially from a single-view image and its semantic projection. The advantage of our method is its awareness of part structures; hence it generates more accurate models with fine-grained structures. Experiments show that our method attains high accuracy compared with other point set generation methods, particularly toward local details.

Keywords: 3D reconstruction, point cloud generation, part-based, semantic reconstruction
* Corresponding author. Email address: [email protected] (Yu Zang)
1. Introduction
As an important branch of computer vision, 3D reconstruction has made great progress. At the same time, deep learning has been widely applied to object generation and other fields [1, 2, 3, 4]. However, single-view reconstruction based on deep learning still faces many challenges, such as low accuracy and missing fine-grained parts.
In this paper, we aim to reconstruct objects from single-view images while retaining more fine-grained details, based on a recurrent generative network. Our method is distinguished by the involvement of the semantic object projection, which instructs object part generation in a sequential manner. Consequently, the generated objects preserve more details of local structures with a high degree of consistency with the actual models. With the part-based generative approach, we can obtain each semantic part of a model, which has been difficult for previous methods.
In summary, we propose a part-based recurrent generative network, which is the first single-view part-based generative network. Our method achieves high accuracy for both partial and integral similarity, using a partial Chamfer distance and a repulsion loss.
2. Related Work
Single-view reconstruction is a longstanding issue in the 3D vision field, due to the ill-posed one-to-many mapping from 2D images to 3D models. Early work utilized a priori knowledge to complete the target, such as texture, specularity, and shadow [5, 6, 7, 8]. However, the requirement of priors on natural images has greatly limited the range of their applications.

With the establishment of ShapeNet [9], a large-scale dataset of 3D CAD models, many deep-learning methods have been proposed in 3D vision. A turning point came with PointNet [10], which handles the irregular ordering of point clouds through a max-pooling operation and achieves better performance on classification and segmentation than previous methods.
These scores are further improved by PointNet++ [11], which adds a hierarchical structure to perceive larger receptive fields.

Single-view 3D reconstruction has developed rapidly. Early work concentrated on generating objects in a grid representation (voxels), derived from 2D convolution by using 3D convolutional layers as the basic unit. 3D-GAN [12] utilizes a generative adversarial network to form objects. 3D-R2N2 [13] proposes a recurrent architecture that combines multi-view images to construct 3D models. To overcome large storage requirements, OGN [14] represents and operates on shapes with an octree inside the network. However, the unnatural form of these voxel-based methods and their large storage footprint have limited them for larger and finer processing of 3D models.
Later work has tended toward point-based networks for 3D reconstruction. PSGN [15] utilizes a complex 2D generative network to encode natural images and obtains relatively good results; one contribution is applying the Chamfer distance (CD) as the loss function to update the network. Afterwards, other cues were introduced to improve the results. An unsupervised network [16] uses multi-view point projections to obtain the 3D point positions. RealPoint3D [17] aims to reconstruct 3D models from natural images with complex backgrounds, guided by the nearest shape retrieved from ShapeNet. A multi-view projection network [18] generates dense point clouds by supervising multiple projections.

Although many point cloud generative networks address the image-based generation problem, some issues remain: local partial structures are usually blurred, and the generated models tend to be close to the average shape of their category. Our method aims to generate semantic part-aware models.
3. Methods
Network Architecture. The proposed network consists of three parts: an encoder, a recurrent feature fusing block, and several part-specialized decoders. The input of PGNet includes the original image and partial projection images, which are obtained from the semantic projection. Despite the increase in required data, the mentioned data are already available online (Sec. 4).
[Figure 1 here. From left to right: data generation (the original image and the sequential partial projections Part1 ... Partn), the shared encoder (Conv1-Conv4 with SqueezeNet), the LSTM fusing block (outputs Out1 ... Outn-1 passed to the next step), and the part-specialized decoders (Partk-Conv5, Partk-Deconv, Partk-FC1, Partk-FC2), each producing an N x 3 part point cloud.]

Figure 1: Network architecture.
First, the original and partial projection images are fed into the encoder, which contains four convolutional layers and a pre-trained SqueezeNet [19]. The benefit of SqueezeNet is its reduced number of parameters, while remaining competitive with alternatives such as VGG [20]. The chair with four parts in Fig. 1 is analogous to a four-word sentence, which is fed into the network four times. In each iteration, the original RGB image and a corresponding partial projection image are fed into the network, and their features are extracted by the same pre-trained SqueezeNet.

For the first part, the partial projection features are concatenated with the feature of the original image, forming a 512-dimensional vector that is fed into the LSTM block. For each subsequent branch, the part's own extracted feature is concatenated with the previous output of the LSTM block. This design considers the inner association between different parts within an object, utilizing the fusing function of recurrent networks, which can integrate different parts sequentially according to their potential associations, such as spatial and structural relationships.

Finally, for different parts, we construct several part-specialized decoders to produce the partial point clouds. By separating the final decoding layers, the network can generate more targeted partial structures and exploit part-level similarities across different objects. The generated part models are then concatenated to form the whole model. The details of PGNet can be seen in Fig. 1; a simplified sketch of the sequential generation loop is given below.
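To make the data flow concrete, the following is a minimal PyTorch sketch of this sequential part-generation loop. The module layout, layer sizes, and names (e.g., PGNetSketch) are illustrative assumptions for exposition rather than the authors' released implementation; in particular, the SqueezeNet backbone is replaced here by a small convolutional stand-in.

```python
# Hedged sketch of PGNet's sequential part generation (assumed interfaces, not the
# authors' code). Each step fuses the original-image feature with the current
# part-projection feature (or the previous LSTM output) and decodes one part.
import torch
import torch.nn as nn

class PGNetSketch(nn.Module):
    def __init__(self, num_parts=4, points_per_part=500, feat_dim=256):
        super().__init__()
        # Stand-in for "four convolutional layers + pre-trained SqueezeNet".
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, feat_dim, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # 512-dimensional fused input -> 256-dimensional latent state (cf. Sec. 4).
        self.lstm = nn.LSTMCell(input_size=2 * feat_dim, hidden_size=feat_dim)
        # One specialized decoder per part, each emitting points_per_part x 3 coordinates.
        self.decoders = nn.ModuleList([
            nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU(),
                          nn.Linear(512, points_per_part * 3))
            for _ in range(num_parts)
        ])
        self.points_per_part = points_per_part

    def forward(self, image, part_projections):
        # image: (B, 3, H, W); part_projections: list of per-part projection images.
        img_feat = self.encoder(image)
        h = img_feat.new_zeros(img_feat.shape)
        c = img_feat.new_zeros(img_feat.shape)
        prev = img_feat                        # first step fuses with the original image
        parts = []
        for decoder, proj in zip(self.decoders, part_projections):
            proj_feat = self.encoder(proj)     # the same shared encoder for projections
            h, c = self.lstm(torch.cat([proj_feat, prev], dim=1), (h, c))
            prev = h                           # later steps fuse with the previous output
            parts.append(decoder(h).view(-1, self.points_per_part, 3))
        return torch.cat(parts, dim=1), parts  # whole model and the per-part clouds
```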
Loss functions. The loss function of our network has two parts: the part-based Chamfer distance (CD) and the repulsion loss. First, to measure the similarity of the generated parts and the ground truth, we use the CD for each part, which is widely used in single-view reconstruction networks [15, 17]. It can be written as

L_{P_i}(g, r) = \frac{1}{n_g} \sum_{x \in g} \min_{y \in r} \|x - y\| + \frac{1}{n_r} \sum_{y \in r} \min_{x \in g} \|y - x\|,   (1)
where P_i denotes the i-th part of the object, g and r denote the generated model and the reference (ground truth), and n_g and n_r denote the numbers of points in g and r. \|\cdot\| denotes the l_2-norm. The CD loss of the whole object is the average over all parts:

L_{CD}(g, r) = \frac{1}{n} \sum_{i=1}^{n} L_{P_i}(g, r).   (2)
The second part is the repulsion loss, which is used by PU-Net [21]. During experiments, we find that the generated points tend to cluster in some local positions. Inspired by PU-Net, we use the repulsion loss to eliminate clustering and form more uniform models. It can be written as

L_R(g) = -\sum_{i=1}^{n_g} \sum_{x_j \in K(x_i)} \|x_j - x_i\| \, e^{-\|x_j - x_i\|^2 / h^2},   (3)
where K(x_i) denotes the k-nearest neighborhood of x_i, h is a constant parameter set to 0.02, and the size of K(x_i) is 5. The objective function is then the sum of the CD loss and the repulsion loss:

L = L_{CD}(g, r) + \alpha L_R(g),   (4)

where \alpha denotes the weight of the repulsion loss, which is set to 0.02.
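As a concrete reference for Eqs. (1)-(4), a minimal PyTorch sketch of the two losses is given below. The batching convention and helper names are our assumptions and not part of the paper.

```python
# Hedged sketch of the part-wise Chamfer distance (Eqs. 1-2) and the repulsion
# loss (Eqs. 3-4); point sets are assumed to be laid out as (batch, points, 3).
import torch

def chamfer_distance(gen, ref):
    # gen: (B, n_g, 3), ref: (B, n_r, 3) -- one semantic part (Eq. 1)
    d = torch.cdist(gen, ref)                                   # (B, n_g, n_r)
    return d.min(dim=2).values.mean(dim=1) + d.min(dim=1).values.mean(dim=1)

def repulsion_loss(points, k=5, h=0.02):
    # points: (B, N, 3); pushes apart points that cluster together (Eq. 3)
    d = torch.cdist(points, points)                             # (B, N, N)
    knn = d.topk(k + 1, dim=2, largest=False).values[..., 1:]   # drop the self-distance
    return -(knn * torch.exp(-knn.pow(2) / h ** 2)).sum(dim=(1, 2))

def pgnet_loss(gen_parts, ref_parts, alpha=0.02):
    # gen_parts / ref_parts: lists of per-part point sets; Eq. 2 averages over parts
    cd = torch.stack([chamfer_distance(g, r)
                      for g, r in zip(gen_parts, ref_parts)]).mean(dim=0)
    rep = repulsion_loss(torch.cat(gen_parts, dim=1))           # Eq. 4, weighted by alpha
    return (cd + alpha * rep).mean()
```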
4. Experiments
Dataset. As mentioned above, our method requires labeled semantic parts, and fortunately there are several approaches to semantic part labeling of object meshes [22, 23, 24]. We construct our training and testing data from the dataset of [23], which contains shape models in the OFF format with ShapeNetCore labels. In our experiments, we choose three categories (airplane, car, chair) to evaluate the effectiveness of our approach. Since the data format is mesh, we first sample points uniformly on the surface of each part of these individual objects; a sketch of such part-wise sampling is given below. We then render the models to obtain the original images and the partial projection images. Note that the number of points is equal for each part. In Fig. 1, the data generation procedure is circled on the left. For each model, we obtain a rendered RGB image and several partial projection images, which are rendered from the labeled models, as well as a whole-object point cloud and several partial point clouds, which are sampled from the labeled meshes.
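The part-wise sampling step can be realized, for example, by area-weighted sampling over the labeled triangles; the sketch below is one such realization under assumed array inputs (vertices, faces, per-face labels), not the exact script used to produce the dataset.

```python
# Hedged sketch of area-weighted uniform point sampling on one labeled mesh part.
import numpy as np

def sample_part_points(vertices, faces, face_labels, part_id, n_points=500, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    tri = vertices[faces[face_labels == part_id]]        # (F, 3, 3) triangles of this part
    # Triangle areas give the sampling weights, so the surface is covered uniformly.
    areas = 0.5 * np.linalg.norm(
        np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0]), axis=1)
    idx = rng.choice(len(tri), size=n_points, p=areas / areas.sum())
    # Uniform barycentric coordinates inside each chosen triangle.
    u, v = rng.random(n_points), rng.random(n_points)
    flip = u + v > 1.0
    u[flip], v[flip] = 1.0 - u[flip], 1.0 - v[flip]
    t = tri[idx]
    return t[:, 0] + u[:, None] * (t[:, 1] - t[:, 0]) + v[:, None] * (t[:, 2] - t[:, 0])
```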
Implementation Details. We use PyTorch as our programming framework and Adam [25] as the optimizer, with a decay rate of 10^{-4}. We train the network on a Titan X GPU and set the batch size to 32, for a total of 100 epochs. The input image size is 227 x 227, and the number of points per part is 500. In the encoder, the kernel sizes of the first two convolutional layers are 1 and those of the last two are 3. The dimensions of the input and latent features of the LSTM block are 512 and 256, respectively. In the decoder, one deconvolutional layer and one convolutional layer, with kernel sizes of 5 and 1, are mainly used to adjust the number of points, and the last two fully connected layers adjust the channel dimension to produce the final three coordinates. ReLU is used as the activation function. A minimal training-loop sketch under these settings follows.
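The sketch below assembles the settings above into a training loop. The dataset interface, the learning rate (not specified in the paper), and the reading of "10^{-4}" as Adam's weight-decay term are assumptions; pgnet_loss and PGNetSketch refer to the sketches in Section 3.

```python
# Hedged training-loop sketch for the settings listed above (batch size 32,
# 100 epochs, Adam with 1e-4 decay); dataset fields and the model are assumed.
import torch
from torch.utils.data import DataLoader

def train(model, dataset, device="cuda", epochs=100):
    loader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=4)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
    model.to(device).train()
    for epoch in range(epochs):
        running = 0.0
        for image, projections, gt_parts in loader:
            image = image.to(device)
            projections = [p.to(device) for p in projections]
            gt_parts = [g.to(device) for g in gt_parts]
            _, gen_parts = model(image, projections)      # PGNetSketch from Section 3
            loss = pgnet_loss(gen_parts, gt_parts, alpha=0.02)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            running += loss.item()
        print(f"epoch {epoch + 1}: mean loss {running / len(loader):.4f}")
```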
Comparison of different settings. Single-view networks have an obvious disadvantage: as training goes on, the network tends to learn the average image-to-model mapping and misses many individual details; hence the generated models for two similar images are also very similar. Our proposed network is designed to learn the local partial structures and individual details.

To evaluate the part-aware function of PGNet, we compare the part-based and object-level CDs. Specifically, our CD loss is the average over parts, which learns the object structure in a part-wise way and produces semantic partial structures, as shown in Eq. 1 and Eq. 2. This owes to the ability of the network to learn the inner association of each part within an object. The result can be seen in Fig. 2. With the part-based CD, PGNet can generate the semantic parts of an object, whereas the object-level CD misses small structures, resulting in clusters, scattered parts, and confused relations between local structures. The explanation can be derived from the principle of CD scores: for each point in one model, we calculate the shortest distance to the other model by searching for the nearest point and computing the straight-line distance between them. However, there are no explicit correspondences for these points, and the situation worsens when the number of points increases and the two shapes differ greatly. So it is reasonable that the object-level CD has more difficulty finding the nearest point, as the search scope is the whole model, while for the part-based CD it is easier to find corresponding points and the CD score is more accurate. The two evaluation settings are sketched below.

Figure 2: Comparison of object- and part-level CDs. (a) Input image; (b) Ground truth; (c) Result of object-level CD; (d) Result of part-level CD.
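To make the two settings precise, they can be written with the loss helpers sketched in Section 3 (again an assumption about interfaces rather than the evaluation code behind the figures): the object-level score matches points anywhere in the model, while the part-level score restricts the search to the corresponding part.

```python
# Object-level vs. part-level CD, using the chamfer_distance sketch from Section 3.
import torch

def object_level_cd(gen_parts, ref_parts):
    # Match against the whole model: the nearest neighbour may lie in any part.
    return chamfer_distance(torch.cat(gen_parts, dim=1), torch.cat(ref_parts, dim=1))

def part_level_cd(gen_parts, ref_parts):
    # Match part against part, then average (Eq. 2); correspondences stay local.
    return torch.stack([chamfer_distance(g, r)
                        for g, r in zip(gen_parts, ref_parts)]).mean(dim=0)
```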
To demonstrate the function of the repulsion loss, we remove it and compare the results. We can see from Fig. 3 that without the repulsion loss, many clusters appear in the generated models. The repulsion loss acts as a "disperser", penalizing points that gather together; the penalty increases as points move closer to each other, as can be seen from Eq. 3. In fact, the repulsion loss is the total of the sums of locally weighted distances of each point. Here, the fast-decaying weight uses an exponential function, and the locally weighted distance decreases with increasing distance. Since the model is normalized, the total summation cannot be very large. From the results, we find that small-scale parts, such as armrests and legs, tend to form local gatherings, as it is harder to find the corresponding points when calculating the CD scores, and several points may take the same point in the other model as their nearest corresponding point. When the repulsion loss is removed, apparent local gatherings appear in the armrest and the leg.

Figure 3: Comparison of repulsion loss. (a) Input image; (b) Ground truth; (c) Result without repulsion loss; (d) Result with repulsion loss.
Comparison to other methods. The proposed method is also compared to the recent single-view approach PSGN [15]. Although later networks have appeared and achieved slight improvements, our method focuses on part-aware object generation and uses the CD score as the measurement; so, in terms of accuracy, we choose to compare with PSGN to demonstrate the effectiveness of our method. The results can be seen in Fig. 4, Fig. 5, and Fig. 6. We find that PSGN tends to generate the average shape of a category, misses many local details, and produces some point clusters, which is similar to the behavior of the object-level CD loss. In fact, PSGN, which reconstructs models through its deep network all at once, uses the object-level CD to measure the discrepancy between the generated objects and the ground truth, while PGNet reconstructs different parts sequentially, retaining semantic and individual partial features. We also find that the performance of PSGN is worse than in the original work [15], since the amount of training and testing data is much smaller than in the original ShapeNetCore for certain categories (several hundred versus several thousand), and this also results in clusters in the generated models.
PGNet-OCD achieves slightly better results than PSGN, possibly because of its
re-
sequential generative manner with individual partial projection to instruct the
urn a
lP
reconstruction, with a smaller network architecture than PSGN.
(a)
(b)
(c)
(d)
Figure 4: Comparison to PSGN [15] on airplanes. (a) Input image; (b) Ground truth; (c) Result of PSGN; (d) Result of PGNet.
The results of CD scores of different parts can be seen in Table 2, and we
Jo
find that the reconstruction accuracy varies as the categories and parts change.
170
The average CD score reflects the reconstruction quality of different categories, which indicates the reconstructed chair models are not as accurate as those of airplanes and cars. This is because the parts of chairs are more complex, with various spatial structures and volumes, than those of airplanes and cars, while airplanes and cars achieve considerable results. 10
(a)
pro of
Journal Pre-proof
(b)
(c)
(d)
urn a
lP
of PSGN; (d) Result of PGNet.
re-
Figure 5: Comparison to PSGN [15] on cars. (a) Input image; (b) Ground truth; (c) Result
(a)
(b)
(c)
(d)
Figure 6: Comparison to PSGN [15] on chairs. (a) Input image; (b) Ground truth; (c) Result
Jo
of PSGN; (d) Result of PGNet.
175
For airplanes, the four parts are the fuselage, wing, empennage, and power-
plant. We can see that the empennage has the lowest CD score of the four parts, while the powerplant has the highest CD score, which means the powerplant is the most inaccurate part. The intuitive presentation is shown in Fig. 4. In the generation procedure, clear and intact exhibition of the part is the premise to
11
Journal Pre-proof
Table 1: CD scores of different methods. PGNet-OCD denotes PGNet with the object-level
180
pro of
CD loss.
Category
PSGN [15]
PGNet-OCD
PGNet
Airplane
0.0218
0.0205
0.0141
Car
0.0242
0.0228
0.0156
Chair
0.0341
0.0358
0.0315
reconstruct the model accurately. The projection of empennage is more intact
re-
than that of the other parts, with more distinct border lines from the other parts and backgrounds. However, the powerplant is smaller and the right portion is always blocked by the fuselage. As for the other two bigger parts, i.e., the fuselage and wings, due to the larger size and more intersecting lines, they are more easily confused with the other parts, which may lower the reconstruction accuracy.
lP
185
For cars in Fig. 5, the four parts are the roof, hood, tire, and body. The roof is the most accurate part, and the tire is the least accurate part. This is because the roof is on top of the car, with similar sizes, positions, and simple shapes. Half the tire is not visible, and due to the small size, a slight deviation
urn a
190
can lead to higher CD scores. The larger parts, i.e., the roof and hood, occupy more space, making it difficult to find and locate the corresponding points. For chairs in Fig. 6, the four parts are the backrest, seat, leg, and armrest. A chair is hard to reconstruct, since it has more complex structures, especially 195
the armrest, which is the most inaccurate part. With various structures and positions, the armrest is difficult to reconstruct, and due to the similar shapes,
Jo
the backrest is relatively easy. Equipped thin complex structures and various space distribution, parts of the chair achieve higher CD scores than those of the other two; hence the chair is generally hard to reconstruct.
12
Journal Pre-proof
200
Category
part1
part2
Airplane
0.0106
Car Chair
pro of
Table 2: CD scores of different parts. Each object consists of four parts.
part3
part4
average
0.0144
0.0099
0.0214
0.0141
0.0089
0.0103
0.0220
0.0211
0.0156
0.0186
0.0262
0.0395
0.0415
0.0315
5. Conclusion
re-
We have designed a part-based recurrent generation network that is more suitable for 3D fine-grained reconstruction, with the instruction of part-projection images. To make the network perceive local structures, three designs are involved, i.e. the sequential processing based on recurrent networks, the partial Chamfer distance and the repulsion loss. Such designs consider the inner asso-
lP
205
ciation between different parts within an object, which can integrate different parts sequentially with their potential association, such as the space and structure relationships. The generated objects have more uniform distributions and
210
urn a
preserve more details of local structures with partial semantic labels.
6. Acknowledgments
This work was supported by the National Science Foundation of China (Project No.61971363 and 61701191).
References
Jo
[1] P. Gao, R. Yuan, F. Wang, L. Xiao, H. Fujita, Y. Zhang, Siamese atten215
tional keypoint network for high performance visual tracking, KnowledgeBased Systems (2019) 105448doi:https://doi.org/10.1016/j.knosys. 2019.105448.
13
Journal Pre-proof
[2] P. Gao, Q. Zhang, F. Wang, L. Xiao, H. Fujita, Y. Zhang, Learning rein-
220
pro of
forced attentional representation for end-to-end visual tracking, Information Sciences 517 (2020) 52 – 67. doi:https://doi.org/10.1016/j.ins. 2019.12.084.
[3] T. Lai, H. Fujita, C. Yang, Q. Li, R. Chen, Robust model fitting based on greedy search and specified inlier threshold, IEEE Transactions on Industrial Electronics 66 (10) (2019) 7956–7966. 225
[4] T. Lai, R. Chen, C. Yang, Q. Li, H. Fujita, A. Sadri, H. Wang, Efficient
re-
robust model fitting for multistructure data using global greedy search, IEEE Transactions on Cybernetics (2019) 1–13doi:10.1109/TCYB.2019. 2900096.
[5] G. Healey, T. O. Binford, Local shape from specularity, Computer Vision Graphics and Image Processing 42 (1) (1988) 62–86.
lP
230
[6] J. Malik, R. Rosenholtz, Computing local surface orientation and shape from texture for curved surfaces, International Journal of Computer Vision 23 (2) (1997) 149–168.
235
urn a
[7] S. Savarese, M. Andreetto, H. Rushmeier, F. Bernardini, P. Perona, 3d reconstruction by shadow carving: Theory and practical evaluation, International Journal of Computer Vision 71 (3) (2007) 305–336.
[8] R. Zhang, P. Tsai, J. E. Cryer, M. Shah, Shape-from-shading: a survey, IEEE Transactions on Pattern Analysis and Machine Intelligence 21 (8) (1999) 690–706.
[9] A. X. Chang, T. Funkhouser, L. Guibas, P. Hanrahan, Q. Huang, Z. Li,
Jo
240
S. Savarese, M. Savva, S. Song, H. Su, Shapenet: An information-rich 3d model repository, Computer Science.
[10] C. R. Qi, H. Su, K. Mo, L. J. Guibas, Pointnet: Deep learning on point sets for 3d classification and segmentation.
14
Journal Pre-proof
245
[11] C. R. Qi, L. Yi, H. Su, L. J. Guibas, Pointnet++: Deep hierarchical feature
pro of
learning on point sets in a metric space.
[12] J. Wu, C. Zhang, T. Xue, W. T. Freeman, J. B. Tenenbaum, Learning a probabilistic latent space of object shapes via 3d generative-adversarial modeling, neural information processing systems (2016) 82–90. 250
[13] C. B. Choy, D. Xu, J. Y. Gwak, K. Chen, S. Savarese, 3d-r2n2: A unified approach for single and multi-view 3d object reconstruction, in: European Conference on Computer Vision, 2016, pp. 628–644.
re-
[14] M. Tatarchenko, A. Dosovitskiy, T. Brox, Octree generating networks: Efficient convolutional architectures for high-resolution 3d outputs, arXiv 255
preprint arXiv:1703.09438.
[15] H. Fan, H. Su, L. J. Guibas, A point set generation network for 3d object
lP
reconstruction from a single image., in: CVPR, Vol. 2, 2017, p. 6. [16] E. Insafutdinov, A. Dosovitskiy, Unsupervised learning of shape and pose with differentiable point clouds, in: Advances in Neural Information Pro260
cessing Systems (NeurIPS), 2018.
urn a
[17] Y. Zhang, Z. Liu, T. Liu, B. Peng, X. Li, Realpoint3d: An efficient generation network for 3d object reconstruction from a single image, IEEE Access PP (2019) 1–1. doi:10.1109/ACCESS.2019.2914150.
[18] C.-H. Lin, C. Kong, S. Lucey, Learning efficient point cloud generation for 265
dense 3d object reconstruction, in: AAAI Conference on Artificial Intelligence (AAAI), 2018.
Jo
[19] F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, K. Keutzer, Squeezenet: Alexnet-level accuracy with 50x fewer parameters and <0.5mb model size, arXiv:1602.07360.
270
[20] K. Simonyan, A. Zisserman, Very deep convolutional networks for largescale image recognition, international conference on learning representations. 15
Journal Pre-proof
[21] L. Yu, X. Li, C.-W. Fu, D. Cohen-Or, P.-A. Heng, Pu-net: Point cloud
275
pro of
upsampling network, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[22] E. Kalogerakis, A. Hertzmann, K. Singh, Learning 3d mesh segmentation and labeling, international conference on computer graphics and interactive techniques 29 (4) (2010) 102.
[23] E. Kalogerakis, M. Averkiou, S. Maji, S. Chaudhuri, 3d shape segmenta280
tion with projective convolutional networks, computer vision and pattern
re-
recognition (2017) 6630–6639.
[24] L. Yi, L. J. Guibas, A. Hertzmann, V. G. Kim, H. Su, E. Yumer, Learning hierarchical shape segmentation and labeling from online repositories, ACM
285
lP
Transactions on Graphics 36 (4) (2017) 70.
[25] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, in: International Conference on Learning Representations.