Advances in Engineering Software 51 (2012) 20–39
Finite element mesh conversion based on regular expressions
Peter Iványi
Department of System and Software Technology, Pollack Mihály Faculty of Engineering and Information Technology, University of Pécs, Hungary
Article info
Article history: Received 18 September 2011; received in revised form 16 May 2012; accepted 23 May 2012; available online 19 June 2012.
Keywords: Finite element mesh; Mesh file format; Geometry conversion; Topology conversion; Conversion of aux data; Regular expression
Abstract
The paper presents a method that can convert different types of finite element mesh files to one format. Only ASCII file representations of finite element meshes are considered, as this type of representation is usually available for all modelling and simulation software used by engineers. The conversion method is based on regular expressions. The paper also presents a classification of the mesh formats used by several types of engineering software. © 2012 Elsevier Ltd. All rights reserved.
1. Introduction
The finite element mesh is the basis of the modelling and simulation of real-life phenomena. Finite element meshes represent real-world objects in a simplified and approximate way, and they are handled by many kinds of modelling and simulation software. An important feature of such software, usually listed among the first features of the program, is the list of formats it can read or write. Several file formats, such as 3DS, OBJ and VRML, are common among modelling software. There is an even greater diversity in the mesh file formats used by researchers in the computational field, such as the LS-DYNA format, the OFF format, etc. A person who wishes to test a model or calculate with a different analysis tool sooner or later has to convert the finite element mesh. The easy way is to use the built-in command of the software; however, this is not possible when the software cannot handle the required mesh format. In this case either a mesh parser is written that follows every detail of the specification of the format, or a “quick-and-dirty” converter is created for the occasion. This paper presents a methodology that is a middle-level solution. The proposed method cannot convert every aspect of every finite element mesh, but it can reliably convert the basic structures of almost every finite element mesh. Limitations of the method will be outlined below.
The paper first defines finite element meshes in Section 2, then describes a classification of the different finite element mesh formats in Section 3. Based on this classification a method has been developed to convert the geometry and the topology of
a finite element mesh as presented in Section 4. Three sets of information are converted: the coordinates of the nodes, the nodes of the elements and the shape of the elements. The latter is also important because, for example, four nodes can represent a quadrilateral or a tetrahedron element. Generally the developed methodology cannot deal with the many aspects of the definition of boundary conditions, material assignments, load assignments, etc., but Section 4.4 will discuss a tagging system that partially solves this problem. The conversion method is based on text processing using regular expressions (Section 4.1), therefore only finite element mesh files in an ASCII format are considered. Binary file formats are not considered in this paper. Finally Section 5 discusses examples and Section 6 concludes the paper.
2. The definition of the finite element mesh
A continuous problem described by partial differential equations cannot, in most cases, be solved over a complex domain ($\Omega$). This leads to approximations, for example geometric approximation, where the domain is discretized. This means that the complex domain must be divided up into smaller pieces in such a way that the sum of the solutions over these smaller pieces approximates the solution over the original domain. The discretized domain is usually called a finite element mesh or a grid. To define a mesh, first a triangulation of a domain [1] is defined. A triangulation ($T$) can be defined for a set of points ($S$) which defines the domain ($\Omega$) as:
1. The vertices of the triangles can be selected only from set $S$.
2. The triangles (simplexes) ($K$) cover the domain ($\Omega$), as $\Omega = \bigcup_{K \in T} K$.
3. The interior of every element $K$ is not empty.
4. The intersection of the interior of two elements is an empty set.
5. The intersection of two elements is either: an empty set; a vertex, an edge or a face.
Unfortunately this definition is limited and a finite element mesh is usually defined with some relaxations of the above conditions. In the definition of a finite element mesh the following conditions still hold:
1. The elements cover the domain ($\Omega$), as $\Omega = \bigcup_{K \in T} K$.
2. The interior of every element $K$ is not empty.
3. The intersection of the interior of two elements is an empty set.
4. The intersection of two elements is either: an empty set; a vertex, an edge or a face.
However, the first condition of a triangulation is not assumed, since new points can be inserted or generated when the mesh is created; and equally important, element $K$ is not assumed to be a triangle in this case. Other types of elements can make up the mesh. This definition may seem to be limited to the two dimensional case, but it can be extended to the three dimensional case where the domain is a volume and the elements have, for example, a tetrahedron or hexahedron shape.
Other common definitions exist where a “polygon mesh” is defined as a collection of vertices, edges, faces and polyhedra, where a vertex is a position in space, an edge is a connection between two vertices, a face is a closed set of edges and a polyhedron is a closed set of faces. This hierarchical definition of a mesh is presented in a much more general and mathematical way in References [2,3], where the author distinguishes “between a combinatorial grid (abstract complex) and its geometric embedding”. Furthermore, according to Berti [3] there is a strong relation between the elements of a grid and their lower dimension components, like faces, edges and points, and these connections can be explored by “incidence operators”. Two remarks must be made here.
The hierarchical relation of the components with different dimensions in the mesh is important in this discussion, as in the classification of the meshes certain formats can be excluded based on the condition that this hierarchy is missing. On the other hand this paper will concentrate only on two levels of the hierarchy of the mesh components, the elements and the corresponding nodes. Edges and faces can be generated from these data, therefore they are not considered in this discussion.
3. Classification of finite element mesh formats
To handle as many mesh formats as possible it is important to classify the diverse mesh formats that exist for different engineering software. In the classification of finite element mesh formats two large groups can be identified:
- The information is defined in a block, one item after the other, consecutively. In this case the reference to the element of a block is either explicit by an identification value (number or string) or implicit by using the order of the information in the referenced and referring block.
- The information is defined individually and later reference can be made to it by some identification value (number or string).
Three kinds of information are mainly considered in this paper: the coordinates of nodes (geometry of the mesh), element nodes and element shapes (topology of the mesh). The format of the coordinates can be classified as:
- Coordinates that can be defined in a block, for example:
    COORDINATES
    id1 x1 y1 z1
    id2 x2 y2 z2
    id3 x3 y3 z3
    ...
- Coordinates that can be defined individually, for example:
    DEFINE POINT id1 x1 y1 z1
    ...
    <some other definitions>
    ...
    DEFINE POINT id2 x2 y2 z2
    ...
In the case of the topology the elements must be defined in terms of the nodes. The current paper identified the following formats of the topology:
- The nodes of the elements and the element shapes are defined in two separate block sections, for example:
    TOPOLOGY
    id1 n1 n2 n3 n4
    id2 n3 n4 n5 n6
    ...
    idn n100 n101 n102
    ...
    CELLTYPE
    id1 quad
    id2 tetrahedra
    ...
    idn triangle
- All the topology information, including the nodal information and the element shape, is defined in one block section consecutively, for example:
    CELLS
    id1 type1 n1 n2 n3 n4
    id2 type2 n3 n4 n5 n6
    ...
- Similarly to the coordinate formats, the nodes and shapes of the elements can be defined individually, where every line may describe a different element. The difference between this case and the previous one is that in this case the definition of the elements can be interrupted by any other information, while in the previous case the definition is continuous. This difference is important in the way the elements are processed. For example:
    DEFINE TRIANGLE id1 n1 n2 n3
    ...
    DEFINE QUAD idn n4 n5 n6 n7
    ...
- As a “degenerate” case it is possible that a set of coordinates directly defines an element topology. This is a “degenerate” case because, although the elements will cover the domain geometrically, there will be no connection between the elements. For example, at a given position there will be as many nodes defined as there are elements referring to it. An example format for this degenerate case is the STL format [4], which is discussed in Section 5.7.1:
    facet normal ni nj nk
      outer loop
        vertex v1x v1y v1z
        vertex v2x v2y v2z
        vertex v3x v3y v3z
      endloop
    endfacet
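The consequence of the degenerate case can be illustrated with a short Python sketch. It is not part of the method presented in this paper: the two-facet STL fragment, the function names `read_facets` and `merge_vertices`, and the choice to merge only exactly identical coordinates are all assumptions made for illustration. The sketch shows that every facet carries its own three vertices, so a shared position is stored once per referring element until the duplicates are merged.

```python
import re

# Hypothetical two-facet STL fragment: the triangles share one edge.
STL_TEXT = """\
solid demo
facet normal 0 0 1
  outer loop
    vertex 0 0 0
    vertex 1 0 0
    vertex 0 1 0
  endloop
endfacet
facet normal 0 0 1
  outer loop
    vertex 1 0 0
    vertex 1 1 0
    vertex 0 1 0
  endloop
endfacet
endsolid demo
"""

NUM = r"([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)"
VERTEX = re.compile(r"vertex\s+" + NUM + r"\s+" + NUM + r"\s+" + NUM)


def read_facets(text):
    """Return a list of triangles, each a list of three (x, y, z) tuples."""
    verts = [tuple(float(v) for v in m.groups()) for m in VERTEX.finditer(text)]
    return [verts[i:i + 3] for i in range(0, len(verts), 3)]


def merge_vertices(facets):
    """Map identical coordinates to one node ID (exact comparison only)."""
    node_id = {}
    elements = []
    for facet in facets:
        elements.append([node_id.setdefault(v, len(node_id)) for v in facet])
    nodes = sorted(node_id, key=node_id.get)
    return nodes, elements


facets = read_facets(STL_TEXT)
nodes, elements = merge_vertices(facets)
print(len(facets) * 3, "vertices in the file,", len(nodes), "distinct nodes")
print(elements)
```

Real STL data would usually need a tolerance when comparing coordinates; exact comparison is used here only to keep the sketch short.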
After identifying and establishing these basic formats, it is stated that the proposed method can convert finite element mesh files when the following conditions are true:
- The mesh format describes a single, connected domain in the mesh.
- Three types of information must exist in the mesh file as a minimum: node coordinates, element description by node numbers and element shapes. Node numbers must refer to node coordinates.
- This information is organized into identifiable “sections”. The beginning and end of each section must be identifiable by some “marker”.
The example section of this paper will show that this definition includes most of the finite element mesh formats. The method can also handle degenerate cases in a limited way. One limitation is, for example, when there are no markers in the mesh file. In this case no general format description can be specified, only a specific one for a single case. The following section describes the method to convert mesh files as defined in this section.
4. The description of the developed method
The developed method offers a way to convert finite element meshes to a single format. The conversion is not automatic in the sense that the method does not recognize the input mesh format by itself; the mesh format must be specified by the user. Fig. 1 shows the schematics of the operation of a program implementing the method presented. The new contribution of this paper is the unified way to specify and describe the mesh file format with regular expressions. As this format describes the mesh, it can be viewed as a mesh file metaformat. The output format follows the e2-Library format [5]. In this paper only one output format is considered, as the paper concentrates on how to read and parse different finite element mesh files.
4.1. Regular expressions
Regular expressions [6–8] are a powerful way to search for text. They are basically strings that specify search patterns. In other words a regular expression is a pattern that describes a certain amount of text. A regular expression can contain different elements. The following is a short list of the features of regular expressions; for a more detailed explanation References [7–9] should be consulted:
- Literal characters: the character itself is matched against or searched for in the text.
- Non-printable characters: for example a tabulator (‘\t’) or a new line character (‘\n’) is searched for in the text.
- Character set: one of the characters in the set is matched against the text. The set is defined between square brackets. For example, the expression [a-z] will match any one character which is a small letter.
- Negative character set: none of the characters in the set should match the text. It is also defined between square brackets but it starts with the “hat” character. For example, the expression [^a-z] will match any character that is not a small letter.
- Quantifiers: the preceding character is repeated a certain number of times, for example:
    - ‘*’: the preceding character can be repeated zero or more times
    - ‘+’: the preceding character can be repeated one or more times
    - ‘?’: the preceding character can appear once or not at all.
  For example the expression a+ will match one or more small ‘a’ letters.
- Alternatives can be specified when the regular expressions are separated by the “vertical line” character (‘|’).
- Anchors can position the regular expression. For example the ‘^’ character positions the search to the beginning of the line, while the ‘$’ character positions the search to the end of the line.
Fig. 1. Schematic description of the conversion program.
It is important to note that regular expressions are “greedy”, which means that they try to match as much text as possible. There are several implementations of regular expressions; in this paper the PCRE implementation [10] is used. The main reason to use the PCRE implementation is that it can reliably handle the new line character. Some of the mesh file formats are line-oriented, where the information is separated by the new line character, but some of the mesh formats are stream-oriented, where the new line character has no special meaning. Unfortunately quite a few implementations of regular expressions handle the new line character specially and therefore it is not possible to use them for stream-oriented mesh file formats.
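The features listed in this section can be tried out with a short Python sketch. Python's `re` module is used here only as a stand-in for PCRE, which is an assumption of this example; the two engines share the syntax shown but may differ in other details. The sample text is hypothetical.

```python
import re

text = "Coordinates   3\n1 0.0 0.0\n2 1.0 0.0\n3 0.0 1.0\nEnd\n"

# Character set and '+' quantifier: keyword, blanks or TABs, then digits.
header = re.match(r"Coordinates[ \t]+([0-9]+)", text)
n_nodes = int(header.group(1))

# Negative character set: everything up to the first whitespace character.
first_word = re.match(r"[^ \t\r\n]+", text).group(0)

# Alternatives with '|'.
keywords = re.findall(r"Coordinates|End", text)

# Anchors: with re.MULTILINE, '^' and '$' also match at each line boundary.
id_lines = re.findall(r"^[0-9]+ ", text, flags=re.MULTILINE)

# Greedy versus non-greedy quantifier on the same input.
greedy = re.search(r"1.*0", "1 0.0 0.0").group(0)
lazy = re.search(r"1.*?0", "1 0.0 0.0").group(0)

# A stream-oriented separator: a new line is treated like any other blank.
tokens = re.split(r"[ \t\r\n]+", text.strip())

print(n_nodes, first_word, keywords, id_lines)
print(repr(greedy), repr(lazy), len(tokens))
```

The last pattern treats the new line as ordinary white space, which is the behaviour a stream-oriented mesh format needs.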
4.2. Syntax of the mesh file metaformat
According to the classes of finite element mesh files established in Section 3, a mesh file metaformat has been defined. A file in this metaformat is the key to the mesh conversion, as it defines in a unified way how the “sections” in the mesh file must be found and how they should be interpreted. The metaformat will be described here by a context-free grammar [11], using the bison [12] syntax. In the description, the words written in small letters are non-terminal symbols, which are made of “smaller” components. The words written in capital letters are terminal symbols or tokens, which can be found as-is in the mesh format files. For the sake of brevity and easier understanding only the most important parts of the grammar are presented in this section. The full grammar contains 67 rules and approximately 2000 lines, which would be too long to present in this paper. The basic parts of the grammar, like string, integer number, double number and so on, will not be explained here as the paper uses their natural meaning.
The mesh file metaformat can contain three parts: a header, a description of the format of the node coordinate data and the description of the format of the topology. The syntax can be seen in Table 1. The table shows two possibilities. Line 2 is included to handle the degenerate cases as discussed in Section 3 while line 3 is the standard mesh file metaformat when coordinates and separate topology data is available in the mesh file. In the header section of the format only one piece of information is specified, whether the number of the first node is zero or one (or in some very rare case a different number). This information is very important when the relation between the node coordinates and the topology must be established. The format of the header is:
    NODE_INDEX_BASE n;
where n is an integer number.
Table 1. The starting symbol of the grammar of the mesh format file.
The nodes and coordinates can be defined by the grammar shown in Table 2. It can be seen that in the case of coordinates the different syntaxes are quite similar. In all cases the COORD_KEYWORD_START line defines how the method can identify the beginning of the section containing coordinates. Here expressions stands for one or several regular expressions. It is important that these regular expressions match all the text in the mesh file until the coordinate data starts. The expressions after COORD_KEYWORD_START may contain not only regular expressions but templates as well. All templates are case insensitive, and to denote a template they start with a dollar sign and end with a dollar sign. An example of a template is $NNODE$, which is equivalent to $nnode$. The templates in the specification mark the position where certain information must be found in the mesh file. For example:
    COORD_KEYWORD_START "Coordinates[ \t]+" $NNODE$ "[ \t\r\n]+"
This specification means that at the beginning of the coordinate section in the mesh the Coordinates text must be found, followed by one or more spaces or TABs. After this text an integer number must be read, which specifies the number of nodes in the mesh file. Finally the integer number is followed by one or more spaces, TABs or newlines. Templates are not mandatory in the mesh file metaformat. On the other hand, when a template is specified then it must be possible to read the specific information from the mesh. This information is used to check and control the conversion process.
The definition of one node can be specified after the COORD_FIELDS keyword. The expressions after the keyword must fully specify the coordinates. This means that the given pattern must be repeatable for node after node. The list of expressions after the COORD_FIELDS keyword can have two formats: free or fixed. This difference is marked by one of two templates: $FREE$ or $FIXED$. These two templates are exceptions because they do not denote any information; their sole purpose is to define the format of the information describing the coordinate data. In the case of the free format any regular expression or template can be specified. The fixed format means that all fields in the coordinate definition have a fixed length, for example 8 characters. Because the length of the data is fixed, in this case only templates are allowed in the specification. An example of the free specification is the following:
    COORD_FIELDS $FREE$ $ID$ " " $X$ " " $Y$ " " $Z$ "\n"
where the coordinate data starts with an identification number ($ID$), followed by exactly one space. The identification number is followed by the x ($X$), the y ($Y$) and the z ($Z$) coordinates, where the coordinates are separated by exactly one space. The end of the coordinate data is marked by a newline character. It is important that the regular expressions are correctly formulated, so that the data will be recognized. For example, if in the previous case the x and y coordinates can be separated by one or more spaces or even a newline, then the regular expression between the $X$ and $Y$ templates must be "[ \t\r\n]+". If the regular expressions are not specified correctly then the coordinate information will not be recognized and most likely an error will be generated. An example of the fixed coordinate specification is the following:
    COORD_FIELDS $FIXED$ $ID$(4) $X$(8) $Y$(8) $Z$(8)
where the same kind of data is read as in the previous example, but all data has a fixed length. The identification number consists of 4 characters and the coordinates consist of 8 characters. Two examples that can be recognized by this fixed format specification are shown next. The first two lines are only displaying the format.
In these examples the first node has the identification number 1 and its coordinates are all zero. The second node has the identification number 1234, its x coordinate is 2.22, its y coordinate is 1 and its z coordinate is 0.5. This format is traditionally used with FORTRAN77 programs.
The COORD_FIELDS_TERMINATOR keyword is provided so that the end of the data field can be looked for explicitly. The string after the keyword is optional for the free format; however, it must be specified for the fixed format, because in the fixed field definition it is not possible to define a regular expression, as was mentioned earlier.
The end of the coordinate section will be recognized by the list of expressions given after the COORD_KEYWORD_END keyword. These expressions can only be strings describing a regular expression; no template can be specified here.
Table 2. Format specification of the coordinate section of a finite element mesh.
The method works in such a way that it tries to find the beginning of the coordinate section by looking for the full pattern given after the COORD_KEYWORD_START keyword. Once the beginning of the coordinate section is found, the expressions after the COORD_FIELDS keyword will be matched with the data given in the mesh file. This data will be the first node in the mesh. If an $ID$ template was specified in the format, then this number is stored with the coordinates so that later on it can be used to identify the node and also to check the validity of the conversion of the mesh. If it is an individual node definition then the COORD_FIELDS_TERMINATOR expression must be found and the expressions after the COORD_KEYWORD_START keyword will be searched for again in the mesh to find a new node definition. On the other hand, if the coordinates are defined in a block, then after reading the coordinate data specified by the expressions after the COORD_FIELDS keyword the following steps have to be performed. If the COORD_FIELDS_TERMINATOR expression is specified then it must be found in the mesh file at the end of the data. Next it should be checked whether the data in the mesh file matches the expressions given after the keyword COORD_KEYWORD_END. If there is a match no more coordinates are read.
Finally, if the end of the coordinate block was not found then there is another attempt to match the expressions after the COORD_FIELDS keyword with the data given in the mesh file. If there is no match, it is an error. If data was read in then these three steps are executed again.
The metaformat for the topology is similar to the metaformat for the coordinates. The main format can be seen in Table 3, while Table 4 shows the corresponding symbols of the grammar.
Table 3. Format specification of the topology of the finite element mesh file.
There is a keyword which identifies the type of the format, then the expressions after the TOPOLOGY_KEYWORD_START keyword specify how the beginning of the topology section can be found. An expression can also be a string or a template; however, after the keyword TOPOLOGY_KEYWORD_START the only meaningful template is $NELEMENT$, which denotes the place where the number of elements can be found in the mesh file. The symbol topology_fields represents one or several element specifications. One element specification starts with the keyword TOPOLOGY_FIELDS, which is followed by a line describing the format of the topology of one element. In this case it is also possible to have a free or fixed format as was described for coordinates. On the other hand, in these expressions the templates can be the integer node numbers, therefore the templates in this section can be: $n1$, $n2$, ..., $n10$. Currently the maximum number of nodes for an element is 10. Besides the node numbers three other templates can be accepted in the expressions: $ID$, denoting the element identification number, $NNODE$, denoting the number of nodes for the element, and $TYPE$, denoting the type of the element. Only for the degenerate case (TOPOLOGY_BY_COORDINATES) are the coordinate templates also accepted: $X1$, $Y1$, $Z1$, $X2$, ..., $X10$, $Y10$, $Z10$. An example of a free format expression in the TOPOLOGY_FIELDS section is the following:
    $FREE$ $ID$ " " $NNODE$ " " $N1$ " " $N2$ " " $N3$ "\n"
In a TOPOLOGY_FIELDS section there must be one line specification as described previously. This line specifies the input from the mesh file. The input line can be followed by one or several output line specifications. An output line specification starts with a number which denotes the type or shape of the element as shown in Table 5. The number is followed by the node numbers as templates, for example:
    4 $n1$ $n2$ $n3$
This type of specification allows element type conversion on the fly. For example, when quadrilateral elements are read from a mesh file, they can be converted immediately to triangles in the following way:
    TOPOLOGY_FIELDS
    $FREE$ $ID$ " " $NNODE$ " " $N1$ " " $N2$ " " $N3$ " " $N4$ "\n"
    4 $n1$ $n2$ $n3$
    4 $n1$ $n3$ $n4$
    TOPOLOGY_FIELDS_END
where one input line is specified to read four node numbers and the two output lines specify two triangles.
Table 4. Auxiliary symbols for the format specification of the topology of the finite element mesh file.
Table 5. Types of elements.
    Number  Type of element
    1       2-point line element
    4       3-point triangle element
    8       4-point quadrilateral element
    12      4-point tetrahedron element
    14      8-point hexahedron, block element
When the topology is defined by a block structure, two sections are actually present in the finite element mesh. One section represents the nodes of the elements and the other section represents the shape of the elements, called celltype in this paper. In this case the keyword CELLTYPE_KEYWORD_START in the topology section specifies the expressions that identify the section with the shapes of the elements. Similarly to the previously described specifications, the keyword CELLTYPE_FIELDS specifies the format of the data in the celltype section of the mesh file. At the end of the celltype data the string specified by the keyword CELLTYPE_FIELDS_TERMINATOR must be found, while the end
of the section of the celltypes is denoted by the list of expressions specified after the keyword CELLTYPE_KEYWORD_END. In the celltype section of the mesh the most important data is represented by the template $TYPE$, which must be specified after the keyword CELLTYPE_FIELDS. This template can be a number or a string, as the different finite element mesh file formats demand. To provide as much checking as possible during the reading of the finite element mesh, the type that is read in the celltype section must match the type that is specified as the first element in the input line of the TOPOLOGY_FIELDS. The specification of the type in the topology can be seen in Table 4 in line 8.
Table 6. An imaginary mesh file format with block definition.
Table 7. Format specification for the imaginary mesh file format in Table 6.
Table 6 shows an imaginary example mesh file with block definition and Table 7 shows the corresponding format specification for this mesh file format. In Table 7 lines 10 and 15 should be observed, as these lines define the strings that occur in the mesh file in Table 6 between lines 9 and 12 (inclusive). In this example the method first tries to match the expressions after the keyword TOPOLOGY_FIELDS with the mesh, so the first five integers separated by a single space character in the line will be sought. If there is a match then the type specified in the input line of the TOPOLOGY_FIELDS is compared to the type that was read in the celltype section. If the two types do not match, then it is not accepted as a valid interpretation of the data in the mesh file and the input line of another TOPOLOGY_FIELDS section will be matched with the read data. This is the reason why the example presented in Table 7 will not generate any tetrahedron element, but all elements will be correctly recognized as quadrilateral or triangle elements.
In Table 3 it can be observed that the formats topology_by_keyword and topology_by_individual are almost identical, except for the starting keyword. Although the format of the two specifications is similar, they differ in the way they process the finite element mesh file.
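The output line specifications and the celltype check described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary layout `FIELD_SPECS`, the function `convert_element` and the celltype strings are hypothetical and do not reproduce the contents of Table 7. Each entry plays the role of one TOPOLOGY_FIELDS section, with one input description and one or more output lines that use the type numbers of Table 5.

```python
# Element type numbers as listed in Table 5.
TRIANGLE, QUAD = 4, 8

# Hypothetical in-memory form of two TOPOLOGY_FIELDS sections. Each entry holds
# the celltype it accepts, the number of node templates on its input line and
# its output lines as (output type, 1-based node template indices).
FIELD_SPECS = [
    {"celltype": "quad", "n_nodes": 4,
     "outputs": [(TRIANGLE, (1, 2, 3)), (TRIANGLE, (1, 3, 4))]},
    {"celltype": "triangle", "n_nodes": 3,
     "outputs": [(TRIANGLE, (1, 2, 3))]},
]


def convert_element(nodes, celltype):
    """Apply the first specification that fits the node list and celltype."""
    for spec in FIELD_SPECS:
        if spec["n_nodes"] != len(nodes) or spec["celltype"] != celltype:
            continue  # not a valid interpretation, try the next specification
        return [(out_type, [nodes[i - 1] for i in picks])
                for out_type, picks in spec["outputs"]]
    raise ValueError(f"no specification matches {celltype!r} with {len(nodes)} nodes")


print(convert_element([10, 11, 12, 13], "quad"))
print(convert_element([10, 11, 12], "triangle"))
```

A four-node element whose celltype is a tetrahedron fits the node count of the first entry but fails the type comparison, so it is rejected instead of being split into two triangles.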
4.3. The process of converting the finite element meshes
This section describes how the finite element mesh is parsed according to a metaformat. During the parsing phase, the data (nodes and elements) are stored in a memory structure. This memory structure was developed for the e2-Lib programming library [5]. Once the parsing is finished and the memory structure is filled, the ASCII file format of the e2-Lib is used to write out the mesh.
4.3.1. Conversion of the coordinates
The following list describes the steps to read the coordinates of a mesh where the coordinates are defined in a block. This format was described between lines 6–11 in Table 2.
1. Open the finite element mesh file.
2. Find the starting keyword and the list of expressions which starts the coordinate data block.
3. Read in data as specified in the mesh file metaformat. Check also whether there is a terminator at the end of the data.
4. If ID (identification number) was specified
       coordinate ID = read ID - NODE_INDEX_BASE
   else
       coordinate ID = generated, continuous, local ID
5. Add point coordinates with the determined ID number to the output finite element mesh.
6. Try to find the keyword which ends the coordinate data block.
7. If neither the block end keyword nor the data terminator was found, it is an error.
8. If the block end keyword was found then it is the end of coordinate block reading.
9. Otherwise goto step 3.
The above steps are partly similar to the list of steps which describe how to read the coordinates of a mesh where the coordinates are defined individually. This format was described between lines 14–18 in Table 2. The individual definition basically means that the coordinates can occur anywhere in the file and each of them has a meaning individually. The consequence of this property is that the starting keyword for the coordinates is searched for repeatedly in the file.
1. Open the finite element mesh file.
2. Find the starting keyword and the list of expressions which starts the coordinate data section.
3. If the starting keyword is not found then there are no more points in the mesh file and this is the end of the conversion of coordinates.
4. Read in data as specified in the mesh file metaformat. Check also whether there is a terminator at the end of the data. If no terminator is found at the end of the data section then it is an error.
5. If ID (identification number) was specified
       coordinate ID = read ID - NODE_INDEX_BASE
   else
       coordinate ID = generated, continuous, local ID
6. Add point coordinates with the determined ID number to the output finite element mesh.
7. Goto step 2.
4.3.2. Conversion of the topology
The reading and processing of the topology is more complex. To explain the processing in a simple way the discussion of the algorithm has been divided.
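For comparison with the topology algorithm, the simpler block-reading steps of Section 4.3.1 can be sketched in Python. This is a minimal sketch under several assumptions, not the implementation used in the paper: Python's `re` module replaces PCRE, the helper `spec_to_regex` and the `TEMPLATES` table are hypothetical, the field terminator is folded into the field pattern, and the sample mesh text is invented. Templates are turned into named groups so that the values they mark can be read back after a match.

```python
import re

NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"
TEMPLATES = {"ID": r"\d+", "NNODE": r"\d+", "X": NUMBER, "Y": NUMBER, "Z": NUMBER}


def spec_to_regex(items):
    """Join quoted regular expressions and $TEMPLATE$ items into one pattern."""
    parts = []
    for item in items:
        if item.startswith("$") and item.endswith("$"):
            name = item.strip("$").upper()   # templates are case insensitive
            parts.append(f"(?P<{name}>{TEMPLATES[name]})")
        else:
            parts.append(item)               # already a regular expression
    return re.compile("".join(parts))


def read_coordinate_block(text, start, fields, end, index_base=1):
    """Return {node ID: (x, y, z)} following the block steps of Section 4.3.1."""
    begin = spec_to_regex(start).search(text)
    if begin is None:
        raise ValueError("start of coordinate block not found")
    field_re, end_re = spec_to_regex(fields), spec_to_regex(end)
    pos, nodes = begin.end(), {}
    while True:
        m = field_re.match(text, pos)
        if m is None:
            raise ValueError(f"cannot read coordinate data at offset {pos}")
        groups = m.groupdict()
        # Read ID minus NODE_INDEX_BASE, or a generated continuous local ID.
        node_id = int(groups["ID"]) - index_base if "ID" in groups else len(nodes)
        nodes[node_id] = tuple(float(groups[k]) for k in ("X", "Y", "Z"))
        pos = m.end()
        if end_re.match(text, pos):          # block end keyword found
            break
    expected = begin.groupdict().get("NNODE")
    if expected is not None and int(expected) != len(nodes):
        raise ValueError("node count does not match $NNODE$")
    return nodes


MESH = "Coordinates  3\n1 0.0 0.0 0.0\n2 1.0 0.0 0.0\n3 0.0 1.0 0.5\nEnd Coordinates\n"
nodes = read_coordinate_block(
    MESH,
    start=[r"Coordinates[ \t]+", "$NNODE$", r"[ \t\r\n]+"],
    fields=["$ID$", " ", "$X$", " ", "$Y$", " ", "$Z$", r"\n"],
    end=[r"End Coordinates"],
)
print(nodes)
```

The final comparison against `$NNODE$` shows how an optional template can be used to check the conversion, as described in Section 4.2.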
The following steps describe the core algorithm of the conversion of the topology. These steps can handle several different element descriptions.
1. Consider the first element description, specified by TOPOLOGY_FIELDS.
2. Try to read in the data as specified in the format. (For example, it is important that the specified number of nodes is read.) Check also whether there is a terminator at the end of the data.
3. If the read data match the specification, go to step 5.
4. If the read data do not match, consider the next element description, specified by the next TOPOLOGY_FIELDS, and go to step 2. If there are no more TOPOLOGY_FIELDS, go to step 11.
5. If an ID (identification number) was specified, element ID = read ID; otherwise element ID = generated, continuous, local ID.
6. If a celltype is available, check whether it matches the available data. Use the element ID to find the corresponding celltype.
7. Convert the element node numbers to stored coordinate IDs and add the element to the output finite element mesh.
8. If the terminator is found, start over and go to step 1.
9. If the end keyword is found, go to step 11.
10. If there is neither a terminator nor an end keyword at the end of the data section, it is an error; stop the conversion.
11. The end.

The following lists describe the different conversion methods which use the core algorithm.

Definition by block: The finite element mesh file is read twice, first for the celltype definition and then for the topology.
1. Open the finite element mesh file.
2. Find the starting keyword and the list of expressions which start the celltype data block.
3. If the starting keyword is not found, then there are no celltypes in the mesh file and this is an error. Stop the conversion.
4. Read in the celltype data according to CELLTYPE_FIELDS. If the format specification and the read data do not match, it is an error. Stop the conversion.
5. If the terminator is found, go to step 8.
6. If the end keyword is found, go to step 11.
7. If there is neither a terminator nor an end keyword at the end of the data section, it is an error; stop the conversion.
8. If an ID (identification number) was specified, cell ID = read ID; otherwise cell ID = generated, continuous, local ID.
9. Add the celltype with the determined ID number to a table in the output finite element mesh.
10. Go to step 4.
11. Position the file pointer to the beginning of the mesh file.
12. Find the starting keyword and the list of expressions which start the topology data block.
13. If the starting keyword is not found, then there are no topology elements in the mesh file and it is an error. Stop the conversion.
14. Use the core algorithm for the conversion of the topology.
15. Close the mesh file.
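The core algorithm can be illustrated with a minimal Python sketch. It is not the author's C implementation: the two regular expressions are hypothetical stand-ins for two TOPOLOGY_FIELDS descriptions (a quadrilateral and a triangle, each with a leading ID), the newline terminator and the END keyword are assumed, and the celltype check of step 6 is omitted.

```python
import re

# Hypothetical stand-ins for two TOPOLOGY_FIELDS descriptions:
# an element ID followed by four nodes (quadrilateral) or three nodes (triangle).
ELEMENT_PATTERNS = [
    ("quad", re.compile(r"[ \t]*(\d+)((?:[ \t]+\d+){4})")),
    ("tri", re.compile(r"[ \t]*(\d+)((?:[ \t]+\d+){3})")),
]
TERMINATOR = re.compile(r"[ \t]*\n")      # assumed data terminator
END_KEYWORD = re.compile(r"[ \t]*END\b")  # assumed block end keyword


def read_topology_block(text, pos=0):
    """Read element records from text, starting at offset pos."""
    elements = []
    while True:
        # Steps 1-4: try each element description in turn.
        for name, pattern in ELEMENT_PATTERNS:
            match = pattern.match(text, pos)
            if match:
                break
        else:
            return elements  # no description matches any more: the end
        # Steps 5 and 7 (simplified): keep the read ID and the node numbers.
        nodes = [int(n) for n in match.group(2).split()]
        elements.append((int(match.group(1)), name, nodes))
        pos = match.end()
        # Step 8: a terminator means that another element may follow.
        terminator = TERMINATOR.match(text, pos)
        if terminator:
            pos = terminator.end()
            continue
        # Step 9: the end keyword closes the block.
        if END_KEYWORD.match(text, pos):
            return elements
        # Step 10: neither a terminator nor an end keyword was found.
        raise ValueError(f"unexpected data at offset {pos}")
```

The order of the descriptions matters in this sketch: the longer quadrilateral description is tried first, so that a triangle description does not match the beginning of a four-node record.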
Definition by keyword: This definition is a mixture of the definition by block and the individual definition. Several format specifications can be defined, one for each element type, while each format specification behaves like a block definition. The finite element mesh file is read from the beginning separately for each element type.
1. Open the finite element mesh file.
2. Consider the first TOPOLOGY_BY_KEYWORD section.
3. Position the file pointer to the beginning of the mesh file.
4. Find the starting keyword and the list of expressions which start this topology data section.
5. If the starting keyword is not found, then this type of topology cannot be found in the mesh file; go to step 7.
6. Use the core algorithm for the conversion of the topology.
7. If there are more TOPOLOGY_BY_KEYWORD sections, go to step 3; otherwise close the mesh file and this is the end.

Individual definitions: This case is similar to the definition by keyword case with one important difference: an element topology specification is searched for repeatedly in the file.
1. Open the finite element mesh file.
2. Consider the first TOPOLOGY_BY_INDIVIDUAL section.
3. Position the file pointer to the beginning of the mesh file.
4. Find the starting keyword and the list of expressions which start this topology data section.
5. If the starting keyword is not found, then this type of topology cannot be found in the mesh file; go to step 8.
6. Use the core algorithm for the conversion of the topology.
7. Go to step 4.
8. If there are more TOPOLOGY_BY_INDIVIDUAL sections, go to step 3; otherwise close the mesh file and this is the end.

It is important to note that step 7, which repeats the search for the starting keyword, is the main difference between the processing of the individual definition and the definition by keyword.

Topology definition by coordinates: In this case the finite element mesh file is opened only once and the defined specification is searched for repeatedly in the file.
1. Open the finite element mesh file.
2. Find the starting keyword and the list of expressions which start this topology data section.
3. Read in the nodes of one element. Store the nodes and the elements in the output mesh.
4. The end expression must be found in the file after the element topology.
5. Go to step 2 until the end of the file is reached.

During the parsing steps the file pointer can be repositioned, as indicated in the text, or alternatively the finite element mesh file can be opened and closed several times.

Finally it is important to detail how the node numbers in the topology and in the coordinate sections are matched. If the $ID$ template is specified for nodes in the coordinate section, then the identification number of the nodes can be arbitrary. In this case
Table 8 Time measurements of the conversion process.

Format          Number of coordinates   Number of elements   Element types             Conversion time (s)
                                                                                       Coordinates   Elements
Wavefront OBJ   18,076                  28,516               Triangle, quadrilateral   1.77          7.84
VTK             12,167                  10,648               Hexahedron                1.10          19.52
LS-DYNA         16,547                  63,414               Hexahedron                2.43          87.17

Table 9 A part of an unstructured finite element mesh in VTK format.
Table 10 Mesh format specification for an unstructured mesh in VTK format.
the number specified by the keyword NODE_INDEX_BASE is subtracted from the value read from the mesh to convert it to a zero-based index. If no explicit identification number was read for a node in the coordinate section, then a local node number is generated, which starts from zero and is incremented by one after each node read. In either case the identification number of the node is stored in a vector. The position of the identification number in the vector can be thought of as the continuous local number, while the stored number comes from the mesh. This structure makes it possible to handle two special situations:

- When the node numbers are not explicitly set in the file, a zero-based, local node number is generated and the topology will refer to these numbers. In this case the vector has no real use, because the stored ID number and the position of the ID number are the same.
- When the node numbers are not consecutive in the mesh, the stored ID numbers will have "gaps", however the positions of the ID numbers in the vector will still be consecutive. The element node numbers refer to the stored ID numbers, but in the output mesh the position of the stored ID number is used, as that number sequence is consecutive.

In the output mesh the identification numbers of the nodes are determined from their position in the vector, therefore the numbering of the nodes is consecutive. In the topology section the value specified by the keyword NODE_INDEX_BASE is subtracted from all node numbers and this modified node number must be located in the vector of identification numbers of nodes. If the identification number of the node cannot be found in the vector, it is a fatal error and the conversion process is terminated. When the identification number of the node is found in the vector, its position is used as the node number in the topology.

4.4. Tagging nodes and elements
For a simulation or a numerical analysis extra information is assigned to parts of the finite element mesh. For example, materials are assigned to the elements and boundary conditions to the nodes. There are several ways to represent this assignment, however the method described in this paper cannot handle all of these cases.
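Returning briefly to the node-number matching described at the end of Section 4.3.2, the role of the vector of identification numbers can be sketched in Python as follows. The function names and the value of NODE_INDEX_BASE are assumptions made for this illustration, and the dictionary used for the lookup is a convenience of the sketch; the text above only states that the number must be located in the vector.

```python
NODE_INDEX_BASE = 1  # assumed value: node numbers in the file start from one


def build_id_vector(read_ids):
    """Store the zero-based identification number of every node in reading order."""
    return [node_id - NODE_INDEX_BASE for node_id in read_ids]


def map_element_nodes(element_nodes, id_vector):
    """Replace the node numbers of one element by their positions in the vector."""
    position_of = {node_id: pos for pos, node_id in enumerate(id_vector)}
    mapped = []
    for node in element_nodes:
        key = node - NODE_INDEX_BASE
        if key not in position_of:
            # Fatal error: the element refers to a node that was never defined.
            raise KeyError(f"node {node} is not defined in the coordinate section")
        mapped.append(position_of[key])
    return mapped
```

With node numbers 10, 20 and 35 read from the coordinate section, an element given as (35, 10, 20) is written to the output mesh with the consecutive node numbers (2, 0, 1).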
Fig. 2. Picture of the unstructured mesh converted from VTK unstructured grid [14].
Instead of a complex representation system, a simple tagging technique is used to try to keep the assigned information after the conversion. In most finite element mesh formats the extra information is specified on the same line as the coordinates or the element nodes, and mostly some identification number is used. The current method utilizes this observation: the mesh format specification described above allows "tags", which are integer or double numbers, to be read along with the coordinate or nodal data. In the mesh file metaformat this means that after the keywords COORD_FIELDS and TOPOLOGY_FIELDS two further templates can appear in the formatted_expression list: $INT$ and $DOUBLE$. These integer and double values are stored with the corresponding node or element and they are written out into the output mesh. For example the following format specification:
Table 11 A polygon based mesh in VTK format.
Table 12 Mesh format specification for a polygon based mesh in VTK format.
COORD_FIELDS $FREE$ $X$ " " $Y$ " " $Z$ " " $INT$ " " $INT$ "\n"

can read the coordinates and tags written in a mesh file like this:

0.0 0.0 0.0 1 1
1.0 0.0 0.0 1 2
0.0 1.0 0.0 1 3
1.0 1.0 0.0 1 4
If these tags are to be used later, then after the conversion process some other steps must be performed, so the required information can be assigned to the node or element according to the value of the tag.
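As an illustration of such a post-processing step, the following Python sketch reads the four sample points above and selects nodes by tag value. The regular expression is a hand-written stand-in for the specification (three coordinates and two integer tags separated by single spaces); how the program expands the templates internally is not shown here, and the function names are hypothetical.

```python
import re

# Assumed hand translation of the specification: three floating-point
# coordinates followed by two integer tags, separated by single spaces.
NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"
LINE = re.compile(rf"({NUMBER}) ({NUMBER}) ({NUMBER}) ([-+]?\d+) ([-+]?\d+)\n")


def read_tagged_nodes(text):
    """Return a list of ((x, y, z), (tag1, tag2)) pairs, one per matched line."""
    nodes = []
    for match in LINE.finditer(text):
        x, y, z = (float(match.group(i)) for i in (1, 2, 3))
        tags = (int(match.group(4)), int(match.group(5)))
        nodes.append(((x, y, z), tags))
    return nodes


def nodes_with_tag(nodes, tag_index, value):
    """Return the positions of the nodes whose selected tag equals value."""
    return [i for i, (_, tags) in enumerate(nodes) if tags[tag_index] == value]


sample = (
    "0.0 0.0 0.0 1 1\n"
    "1.0 0.0 0.0 1 2\n"
    "0.0 1.0 0.0 1 3\n"
    "1.0 1.0 0.0 1 4\n"
)
nodes = read_tagged_nodes(sample)
```

A boundary condition could then be assigned, for example, to every node returned by `nodes_with_tag(nodes, 0, 1)`; what a tag value means is decided by the user and is not part of the conversion.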
5. Examples
The full description of all of the mesh file formats is beyond the scope of this paper; only the most important features of each mesh file format will be presented. For full details of a specific format the provided references should be consulted. The presented format specifications are also special in the sense that they are valid for reading the particular finite element mesh file shown, however some minor adjustments of the regular expressions may be required for other finite element meshes in the same format. For example, in one file a single space is used between node coordinates, while in other files TABs are used. These types of differences may not be represented in the examples for the sake of brevity.
Fig. 3. Picture of the mesh converted from a polygon based mesh in VTK format [14].
4.5. Implementation issues
A program implements the method described above. The program uses the following Open Source components:
- PCRE, the regular expression library: version 8.0, released on 19 October 2009;
- Flex, the fast lexical analyzer generator: version 2.5.35;
- Bison, the parser generator: version 2.4.2.

From the syntax described in Section 4.2, the Flex and the Bison programs generate two files. The Flex program is invoked with the -i argument, thus the generated lexical analyzer is case insensitive. The options for the Bison program are: -d -y. The -y option makes Bison emulate the POSIX yacc program, while the -d option generates a header file for the parser. All components have been written in the C programming language, therefore the whole program can be easily ported to any operating system and environment. The program has been tested under the Linux and Windows operating systems, and Table 8 shows some timing data about the performance of the compiled program under the Linux operating system. From the table it can be seen that the relationship between the conversion time of the coordinates and the number of nodes is somewhat linear. On the other hand, in the case of the element conversion the type of the element is important. The conversion of hexahedron elements is more time consuming than the conversion of triangles and quadrilaterals.

Table 13 An example LS-DYNA file.
5.1. VTK format
From the point of view of the presented method there are several mesh formats in the VTK toolkit [13]. In this section two different meshes will be presented which can be found in the distribution of the VTK toolkit [14]. The first mesh (blow.vtk) contains an unstructured grid of triangle and quadrilateral elements and a part of the mesh can be seen in Table 9. In the listing the triple dots mark that some part of the mesh has been cut for brevity. This mesh format is a typical example of the definition by block format with a separate celltype block. The format description of the mesh is shown in Table 10 and the picture of the converted mesh is shown in Fig. 2.

The other example mesh (fran_cut.vtk) in the VTK format contains polygons. The polygons in this mesh are three sided, therefore it is easy to convert them to triangle elements. Table 11 shows a part of the input mesh. It can be seen that in this case the coordinates are still defined by a block section, however the elements are defined by a keyword. This mesh format is an example of topology definition by keyword. Table 12 shows the format description and Fig. 3 shows the picture of the converted mesh.

5.2. LS-DYNA format
The finite element mesh for the LS-DYNA software [15] can be in a free or fixed format. In the free format the fields are separated by commas, for example:

*NODE
111,x,y,z
112,x,y,z
Table 14 Mesh format specification for the LS-DYNA format.
Table 15 An example VRML1 file [18].
This section, however, discusses the fixed format. In the fixed format the number of characters that can be used for a data field is specified. For example, the previous node definition in fixed format is the following:
Fig. 4. Picture of the mesh converted from the LS-DYNA format.
Table 16 Mesh format specification for format VRML1.
*NODE
     111               x               y               z
     112               x               y               z
where the node identification number can contain 8 characters and the coordinates can contain 16 characters. Table 13 shows a part of an LS-DYNA finite element mesh which was used in Reference [16]. Table 14 shows the format description which can convert the LS-DYNA file to e2-Library format. It is important to point out that in the second TOPOLOGY_FIELDS node $n3$ is used twice in the input line. This is a specification for a triangle element in LS-DYNA. The developed method can recognize this format via the template mechanism and although every line in the mesh contains 4 node numbers for each element, the program can correctly recognize triangle elements where the last node is repeated. The example file after the conversion can be seen in Fig. 4.
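Both ideas of this section can be sketched in a few lines of Python. The first function slices a fixed-format node line using the field widths quoted above (8 characters for the ID, 16 for each coordinate). The second uses a regular expression backreference to play the role of the repeated $n3$ template; it operates only on the four node fields of an element, because the full layout of the element line is given in Table 14 and is not reproduced here. All names are hypothetical.

```python
import re


def parse_fixed_node(line):
    """Split a fixed-format node line: 8 characters for the ID, 16 per coordinate."""
    node_id = int(line[0:8])
    x, y, z = (float(line[8 + 16 * i: 24 + 16 * i]) for i in range(3))
    return node_id, (x, y, z)


# The backreference \3 requires the fourth field to repeat the third one,
# which mirrors using the same node template twice in one description.
TRIANGLE = re.compile(r"\s*(\d+)\s+(\d+)\s+(\d+)\s+\3\s*")
QUAD = re.compile(r"\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*")


def classify_shell(node_fields):
    """Classify the four node fields of an element as a triangle or a quadrilateral."""
    for name, pattern in (("triangle", TRIANGLE), ("quadrilateral", QUAD)):
        match = pattern.fullmatch(node_fields)
        if match:
            return name, [int(g) for g in match.groups()]
    raise ValueError(f"cannot interpret node fields: {node_fields!r}")
```

The triangle description is tried first, since the general four-node description would also accept a record with a repeated last node.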
5.3. VRML 1 format
The VRML (Virtual Reality Modeling Language) [17] format was developed in the 1990s. The format was based on the Open Inventor ASCII file format from Silicon Graphics (SGI). The initial name of the file format was Virtual Reality Markup Language, as the HTML format had a great influence on it. The format was designed to be platform independent, extensible and to work well over low-bandwidth network connections.

Fig. 5. Picture of the mesh converted from a VRML1 format [18].
Table 17 A part of an example OBJ file contributed by Viewpoint Animation Engineering.
Table 18 Mesh format specification for format OBJ.
Fig. 7. Picture of the converted LUSAS input file.
Fig. 6. Picture of the converted Wavefront OBJ mesh (created by Viewpoint Animation Engineering).
Table 19 A part of an example LUSAS file.
Table 20 Mesh format specification for format LUSAS.
A new version of the format was proposed in 1997 as VRML97 or VRML2 and later became an ISO standard. The format has been managed by the VRML Consortium, which later became the Web3D Consortium and developed the X3D format. The format describes the model in a hierarchical structure which creates a scene graph. The model description can be very complex, including transformations, materials and light definitions. These elements of the scene graph are not handled, as they are usually not present in a finite element mesh. The method in this paper only considers VRML models which are similar to finite element meshes and contain an "IndexedFaceSet" section. Table 15 shows a part of an input file [18] generated by the KnotPlot [19] program written by Rob Scharein. The VRML file format is also interesting because it is a stream-oriented format where the new line character has no special meaning. Table 16 shows the format description which can convert the VRML file to the e2-Library format. The example file after the conversion can be seen in Fig. 5.

Table 21 A part of an example STL file.
Table 22 Mesh format specification for the STL format.
5.4. Wavefront OBJ format
This file format was developed by Wavefront Technologies and is widely used by several types of software. According to the classification of this paper, the file format uses the individual definition. The coordinates of a point are defined on a line which starts with the letter 'v'. A face, a triangle or quadrilateral element, is defined by a line starting with the letter 'f'. The node numbers can be followed by texture coordinates or by normals. An example mesh provided by Viewpoint Animation Engineering will be used here. Table 17 shows a part of the input Wavefront OBJ file and Table 18 shows the format description which can convert the file to the e2-Library format. The example file after the conversion can be seen in Fig. 6.

5.5. LUSAS command format
The LUSAS "file format" originates from the LUSAS finite element software, which has been developed by Finite Element Analysis Ltd. in the United Kingdom from 1970 [20]. The file that is used by the software is actually a command file. In the file, points and elements are defined individually, which also makes it a unique example. Table 19 shows a part of the input LUSAS file. In this file the triple dots (...) are actually used in the definition of the elements and their meaning is different from the other examples. In this example mesh only line elements are used. The mesh was used in Reference [16]. Table 20 shows the format description which can convert the mesh in LUSAS format to the e2-Library format. The example file after the conversion can be seen in Fig. 7.
5.6. Other formats
The method developed has also been tested successfully with the following finite element mesh formats, which are not presented here: IDEAS universal file format, ANSYS command format, OpenDX format, Gambit format, Tetgen format, GMSH format, MEDIT format, Impact format.

5.7. Degenerate formats
As was discussed in Section 3, some of the mesh file formats are considered to be degenerate cases, because they define disconnected elements. Two of these formats will be presented in this section, because after reading these meshes according to their format they can be converted easily to a connected mesh by searching for and removing the duplicate nodes. Duplicate nodes are nodes with coinciding coordinates, or nodes for which the distance between the two points is very small.

5.7.1. STL format
The STL file format [4] can represent surfaces made of unstructured triangles. The format is most commonly used in rapid prototyping, manufacturing software and stereolithography. The format simply represents the geometry of the triangles; no color, texture or other properties can be assigned to the triangle elements. There is a binary version of the file format, but this paper considers only the ASCII version. Table 21 shows a part of an STL file which represents a sphere. Table 22 shows the format description of the STL mesh files. The example STL file after the conversion can be seen in Fig. 8.

Fig. 8. Picture of the converted STL input file.
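The removal of duplicate nodes mentioned in Section 5.7 can be sketched in Python as follows. The sketch is not taken from the described program: it snaps coordinates to a grid whose cell size equals an assumed tolerance and merges the nodes that fall on the same grid point. This is a simplification, since two points closer than the tolerance can still be rounded to neighbouring grid points; an exact distance test would need a spatial search.

```python
def merge_duplicate_nodes(triangles, tolerance=1e-9):
    """Turn disconnected triangles into a connected mesh by merging coincident corners.

    triangles: a list of triangles, each given as three (x, y, z) tuples.
    Returns (nodes, elements), where elements index into nodes.
    """
    nodes = []     # unique coordinates in order of first appearance
    lookup = {}    # snapped coordinates -> node index
    elements = []
    for triangle in triangles:
        element = []
        for point in triangle:
            # Snap every coordinate to the tolerance grid to build a hashable key.
            key = tuple(round(c / tolerance) for c in point)
            if key not in lookup:
                lookup[key] = len(nodes)
                nodes.append(tuple(point))
            element.append(lookup[key])
        elements.append(element)
    return nodes, elements
```

For two triangles which share an edge, the six corner points read from the file are reduced to four nodes and both elements refer to the two shared nodes.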
Table 23 A part of an example DXF file.
5.7.2. AutoCAD DXF format
The DXF format is the Drawing Exchange Format developed by Autodesk so that the CAD software of the company can exchange data with other programs. The file format tries to be an exact representation of all possible objects in the AutoCAD software. The DXF file format has an ASCII version and this version is considered in this section. In the file the information is divided into sections. Every data type is identified by a number which is followed by the data itself. Table 23 shows a small part of a DXF file. This data file was taken from Reference [21]. The part in the table represents a part of a line object in the file. Table 24 shows a simple way to parse a DXF file containing line objects. Fig. 9 shows the converted mesh.

Table 24 Mesh format specification for the DXF format.

Fig. 9. Picture of the converted DXF file.

6. Conclusions
The paper presented a classification of text file formats of finite element meshes. Following this classification, a methodology and a program have been developed to parse and convert these finite element meshes to a single format. The reading and parsing are performed using regular expressions. The paper has described a format specification for these meshes and has shown successful conversion for several different finite element mesh formats.

Acknowledgments
The author would like to acknowledge the support of Ecole Centrale Paris, France and Metropolitan State College of Denver, Denver, Colorado, USA.

References
[1] Frey PJ, George PL. Mesh generation, application to finite elements. Oxford: Hermes Science Publishing; 2000.
[2] Berti Guntram. Gral – the grid algorithms library. Future Gener Comput Syst 2006;22(1–2):110–22.
[3] Berti Guntram. General software components for scientific computing. PhD thesis, Brandenburgischen Technischen Universität Cottbus; 2000.
[4] http://www.ennex.com/fabbers/stl.asp.
[5] Iványi Peter, Putanowicz Roman. The e2-library programming manual. Edinburgh, United Kingdom: Heriot-Watt University; 2010.
[6] The Open Group. The Single UNIX® Specification, Version 2.
[7] Friedl Jeffrey EF. Mastering regular expressions. O'Reilly Media; 2006.
[8] Goyvaerts Jan, Levithan Steven. Regular expressions cookbook. O'Reilly Media; 2009.
[9] Goyvaerts Jan.
[10] Hazel Philip. PCRE – perl compatible regular expressions.
[11] Chomsky Noam. Three models for the description of language. Inform Theory IEEE Trans 1956;2:113–24.
[12] Donnelly Charles, Stallman Richard. Bison, the YACC-compatible parser generator. Free Software Foundation; 1995.
[13] Schroeder Will, Martin Ken, Lorensen Bill. The visualization toolkit: an object-oriented approach to 3D graphics. Kitware, Inc.; 2006.
[14] Visualization toolkit.
[15] Livermore Software Technology Corporation. LS-DYNA, Keyword user's manual, version 970 edition; April 2003.
[16] Iványi MM, Bancila R, Iványi P, Iványi M. Stability and ductility of planar plated steel structures. Pécs, Hungary: Pollack Press; 2010.
[17] Bell Gavin, Parisi Anthony, Pesce Mark. The virtual reality modeling language, version 1.0 specification; 1995.
[18] Scharein Rob. Knot model created by knotplot; 1996.
[19] http://knotplot.com/.
[20] http://www.lusas.com/the_company.html.
[21] Arturo Perez Fernandez. Programming Victor Vasarely with autolisp. Master's thesis, University of Pécs, Hungary; 2011.