• Recent Articles

    jsiwila

    Processing Records Rejected by a Read File Operator

    This is a companion article to Processing Rejected Records. That article deals with records rejected by the Read Excel operator, and it also applies to Read operators in other expressor Extensions and to the Read Custom operator.

    Parsing records rejected by the Read File operator is simpler than parsing records from Read Excel and related operators. The Read File operator retains the sequence of fields and attributes when it writes them to the RecordData field of the Reject Record Schema. Because of that, it is not necessary for the Read File operator to write header records to describe the field and attribute sequence. That greatly simplifies the datascript required to parse the RecordData field and reconstruct the rejected records.

    As explained in the Processing Rejected Records article, when a Read operator... read more
    jsiwila 05-14-2012, 03:33 PM
    jlifter

    expressor tutorials

    In this section you will find a collection of older tutorials that demonstrate the features of expressor Studio and the entire expressor Data Integration Platform. Though... read more
    jlifter 05-08-2012, 04:08 PM
    jsiwila

    Release Notes - expressor 3.6.0, 3.6.1, 3.6.2, 3.6.3, and 3.6.4

    expressor 3.6.4 fixes four bugs, three in Studio (STU-4728, STU-4730, and STU-4744) and one having to do with reading Informix databases from dataflows running on Linux (PRO-2634). See the resolved issues section below for a description of each of the fixed bugs.

    Also, see STU-4729 in the known issues section below. STU-4729 describes an issue encountered when upgrading artifacts in a Repository Workspace.

    Note that the Informix ODBC drivers shipped with expressor software do not support Unicode on Linux.

    expressor 3.6.3 fixes three bugs in Studio: one covering memory leaks, another dealing with binding to the Oracle NUMBER data type, and a third that caused unnecessary rounding when converting a double precision value to decimal. See STU-4641, STU-4625, and PRO-2622 under resolved issues below. Also see STU-4657 in the known issues section immediately following for a workaround to a problem dealing with NUMBER... read more
    jsiwila 05-08-2012, 09:02 AM
    jsiwila

    Processing Rejected Records

    When records produce errors because they violate constraints set on Composite Type
    attributes or for other reasons, the operator that encounters the error can handle them
    by skipping them, aborting the dataflow, or rejecting the offending records. In some
    cases, it is sufficient to simply send rejected records to a Write File operator
    and examine the records in the output file. If the intent is, however, to correct
    or otherwise use those records, examining each error and changing the data could
    be very cumbersome. The more efficient approach would be to reprocess the records
    as they come out of the reject port.

    Records rejected by input operators such as Read File, Read Table, and Read Excel are structured into the following fields:

    RejectType
    RecordNumber
    RecordData
    RejectReason
    RejectMessage

    The record data as it was constituted before being rejected is contained in the RecordData field. To process that data, it must first be reconstructed from the rejected record format. Several factors affect the reconstruction. The order of the record data fields can be different from the order represented in the original Schema, and some of the records emitted from the reject port do not contain record data. For example, RejectType 1 errors are constraint violations, but before they are emitted, a RejectType 4 record is emitted. The RejectType 4 record contains the record data field order for the subsequent RejectType 1 errors in its RecordData field. The RejectTypes are fully explained in the Using the Reject Port section of the Read Custom operator topic in the product documentation.
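The reconstruction described above can be sketched in Python. This is only an illustration, not expressor datascript: it assumes that RecordData is comma-delimited text, that a RejectType 4 record lists field names, and that each following RejectType 1 record lists values in that same order. The sample field names (id, name, city) and the RejectReason and RejectMessage values are hypothetical.

```python
import csv
from io import StringIO


def split_fields(record_data, delimiter=","):
    """Split one RecordData string into a list of values (assumed delimited text)."""
    return next(csv.reader(StringIO(record_data), delimiter=delimiter), [])


def reconstruct(rejects, schema_order):
    """Yield one dict per RejectType 1 record, with keys arranged in schema_order."""
    field_order = None
    for rec in rejects:
        if rec["RejectType"] == 4:
            # Assumed: this record carries the field order, not record data.
            field_order = split_fields(rec["RecordData"])
        elif rec["RejectType"] == 1:
            if field_order is None:
                raise ValueError("RejectType 1 record seen before any RejectType 4 record")
            values = split_fields(rec["RecordData"])
            by_name = dict(zip(field_order, values))
            yield {name: by_name.get(name) for name in schema_order}


# Hypothetical reject stream: the field order differs from the Schema order.
rejects = [
    {"RejectType": 4, "RecordNumber": 0, "RecordData": "city,id,name",
     "RejectReason": "", "RejectMessage": ""},
    {"RejectType": 1, "RecordNumber": 7, "RecordData": "Boston,42,Ann",
     "RejectReason": "constraint", "RejectMessage": "hypothetical message"},
]
rows = list(reconstruct(rejects, ["id", "name", "city"]))
```

Looking values up by name rather than by position is what lets the sketch tolerate a field order that differs from the original Schema.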

    Note: All non-input operators that have a reject port emit rejected records with the existing attributes of the record; that is, they do not restructure the records the way input operators do. Records rejected by non-input operators do not have to be reconstructed before they are reprocessed
    ... read more
    jsiwila 05-07-2012, 10:00 AM
  • Release Notes - expressor 3.4.2

    expressor 3.4.2 is a 32 bit ETL application that can be deployed onto computers running the Windows operating system. expressor Studio should be installed onto computers running Windows XP or Windows 7, 32 or 64 bit operating systems. expressor Repository and expressor Data Processing Engine may be installed onto computers running the Windows Server 2003 or 2008, Windows XP, SP3, or Windows 7 Professional or Enterprise, 32 or 64 bit operating systems.

    The following list of search terms can help locate specific issues. Many of the terms have more than one reference in the listings below. The terms occur either within the text of the issue description or in a search term list appended to the issue description. Use your browser's Find box to locate the references.
    • validate attributes
    • large number of attributes
    • table schema
    • delimited schema
    • lookup keys
    • output attributes
    • dataflow step
    • earlier versions
    • error handling
    • datascript module reference
    • function rule

    What's new in expressor 3.4, expressor 3.4.1, and expressor 3.4.2

    expressor 3.4, expressor 3.4.1, and expressor 3.4.2 include the following new features and functionality.
    • Attribute Propagation: the behind-the-scenes transfer of data across transforming operators, which provides for simplified coding and higher throughput
    • Configuration Parameters: provides the ability to change operator properties, for example, database connection details, at runtime
    • Desktop Edition: a licensing upgrade to expressor Community Edition that offers a higher level of performance
    • Lookup Tables: an embedded relational database management system that supports creation and maintenance of localized data storage
    • Multi-step Dataflows: provide sequential processing of separate but related data processing tasks
    • New Operators:
      • Read Lookup Table
      • Write Lookup Table
      • Pivot Row
      • Pivot Column
      • Multi-test Filter Operator

    • Persistent Values: ability to pass values between operators within a dataflow and to retain values across sequential executions of a dataflow
    • Rules Editor: a graphical coding environment that makes it easy to exploit the power of Attribute Propagation and provides for more effective code reuse
    • Write Table Operator Merge Method: executes either an update or insert operation as appropriate
    • Parameter Management: ability to change operator properties at runtime
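As an illustration of the "update or insert as appropriate" behavior listed for the Write Table Operator Merge Method, here is a minimal Python sketch against an in-memory SQLite database. It is not how the operator is implemented; the customer table, its columns, and the merge_row helper are hypothetical.

```python
import sqlite3


def merge_row(conn, row):
    """Update the row whose key matches; insert it when no such row exists."""
    cur = conn.execute("UPDATE customer SET name = :name WHERE id = :id", row)
    if cur.rowcount == 0:
        conn.execute("INSERT INTO customer (id, name) VALUES (:id, :name)", row)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
merge_row(conn, {"id": 1, "name": "Ann"})   # no matching key: inserted
merge_row(conn, {"id": 1, "name": "Anne"})  # matching key: updated
rows = conn.execute("SELECT id, name FROM customer").fetchall()
```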

    Installation

    If you have used an earlier version of expressor Studio, be certain to back up your workspaces before installing expressor Studio 3.4.2. In your My Documents folder (or in whichever folder you stored your expressor workspaces), make a copy of the expressor folder. If you later decide to uninstall expressor 3.4.2 and re-install the previous version, you will need to delete any workspaces created with expressor 3.4.2 and return to the workspaces created with the prior version.

    Note: Before installing expressor Studio on Windows XP, install hotfix KB943326. Before installing expressor Studio on any platform, install hotfix KB967328.

    If you want to completely remove prior installations and all workspaces and project artifacts:
    • Use the Windows Control Panel utility to uninstall expressor Studio.
    • Delete the directories:
      • Windows 7:
        • C:\Users\username\AppData\Roaming\expressor
        • C:\Users\username\AppData\Local\expressor
        • C:\Users\username\AppData\Local\expressor_software

      • Windows XP:
        • C:\Documents and Settings\username\Application Data\expressor


    • Discard the download file expressorStudioInstaller.exe from prior installations.
    • Delete, or rename, the Workspaces directory.
      • Windows 7:
        • C:\Users\username\Documents\expressor\Workspaces

      • Windows XP:
        • C:\Documents and Settings\username\My Documents\expressor\Workspaces



    The following known issues have been identified.

    1. STU-3194
      Rules Editor takes up to 20 seconds to open when the operator contains a rule with a large number of input or output parameters. [large number of attributes]
    2. STU-3189
      When using Auto Generate to map a Schema to Composite Type attributes in the Schema editor, the mapping lines do not appear even though the Schema shows that a change has been made and not saved. The Schema must be saved, closed, and reopened for the mapping lines to become visible.
    3. STU-3186
      Rules Editor's type-ahead feature does not recognize changes to input parameters. It displays old input parameter names instead of the current input parameter names. [validate attributes]
    4. STU-3170
      A new Table Schema created from a Type is placed in the first project in the workspace, regardless of which project contains the Type used to create the Schema.
    5. STU-3167
      When selecting a large number of attributes in the Rules Editor with the Select All option on the Edit toolbar, only the nonvisible attributes are highlighted. Until you scroll, it appears no additional attributes were selected.
    6. STU-3166
      Validation of Required attributes (those propagated upstream) does not work. To work around this issue, set the Schemas in both the input and output operators, then disconnect both the input and output links to the transformation operator and reconnect them. This resets all the output attributes. [validate attributes]
    7. STU-3112
      In the Rules Editor, selecting and dragging a large number of output attributes to a rule's output parameters takes a long time to process.
    8. STU-3106
      Noticeable delay when opening the Rules Editor for an operator that has a large number of input attributes. [large number of attributes]
    9. STU-3085
      Changes to dataflows are not indicated in the Deployment Packages that contain them.
    10. STU-3066
      Performance is slow when a large number of output attributes are selected and mapped simultaneously to a rule's output parameters.
    11. STU-3063
      The New Table Schema from Upstream Output option on the Schema property in the Write Table operator presents the wrong Schema wizard. It presents the New Table Schema from Type rather than the New Table Schema from Upstream Output. [table schema]
    12. STU-3044
      Duplicate field or column names might cause wizards for New Delimited Schema and New Table Schema to stop at the naming step. The error message indicating that a duplicate name has been entered is not always visible. [table schema, delimited schema]
    13. STU-3033
      Save as Template dialog box saves last operator template rather than current operator template. The dataflow must be saved, closed, and reopened to save another operator template.
    14. STU-3024
      Lookup Table key names containing space characters are not flagged as invalid until the dataflow fails. [lookup keys]
    15. STU-3013
      Data types for output attributes are not adjusted appropriately when an upstream operator is disconnected.
    16. STU-2968
      Saved dataflows marked as changed when reopened.
    17. STU-2951
      Studio does not always activate the most permissive license installed; it sometimes activates the Studio-only license even though a more permissive license has been installed.
    18. STU-2776
      Join operator in Studio 3.4 does not always run as expected in dataflows created with earlier versions of Studio.
    19. STU-2773
      Input attributes are not available in the Function Rule datascript editor if the Function Rule was created by conversion from another type of rule. Even if inputs are added manually to the rule, they do not appear when typing datascript. However, all works fine if the Function Rule is created from scratch.
    20. STU-2728
      There is no validation to ensure Lookup Tables have at least one attribute that is not a key. However, if a Lookup Table does not have at least one attribute that is not a key, then it contains nothing to lookup. [lookup keys]
    21. STU-2291
      A validation warning is not issued when Error handling is set to Reject and the reject port is not connected.
    22. STU-2163
      No notification of lost Datascript Module reference. When a Project's Library Reference is removed and the Library contains a Datascript Module that is used by an open dataflow, no validation error is displayed. The broken reference to the Datascript Module is not manifest until the dataflow runs and fails.
    23. STU-1942
      When defining "allowed values" for a String data constraint, must include default values in list of allowed values.
    24. STU-1566
      Quote character, field delimiter, and record delimiter cannot be the same in a delimited schema, but validation of the schema does not fail if they are the same. The characters used as the quotation mark and the field and record delimiters cannot be the same. This restriction is documented in the Create Delimited Schema topic in online help. But violation of this restriction is not indicated when the settings are specified for the schema. The conflicts will, however, cause an error when the dataflow runs.
    25. PRO-2296
      The eflowsubst command's -O option places the substitution file in the external directory under the Deployment Package instead of the Deployment Package's dataflow directory.
    26. PRO-2294
      Running a dataflow containing a Write Parameters operator in a Free Studio Edition produces vague error message. Error message should indicate that the Write Parameters operator cannot be used with Free Studio.
    27. PRO-2287
      When Generate record is chosen as the On Miss action in a Lookup rule, a value must be supplied for all the output parameters in the rule, even those that are not mapped to output attributes. To work around this, a meaningless value can be assigned to the output parameters that are not mapped to output attributes.
    28. PRO-2262
      When Datascript Module cannot be found, error messages do not indicate clearly or in a timely manner that the require statement cannot be executed. [datascript module reference]
    29. PRO-2259
      The Unique key value in a Lookup Table gets changed even after an error indicates that changing a Unique key value is not allowed. [lookup keys]
    30. PRO-2250
      When writing an oversized string to a Teradata database with the batch size set to the default (4096), the Reject Record error handling option does not send the oversized record to the reject port.
    31. PRO-2246
      The Write Table operator's Reject Record error handling does not work when it is connected to an Oracle database and Merge Mode has been set.
    32. PRO-2232
      The Write Table operator's Reject Record error handling does not work when it is connected to a DB2 database and Merge Mode has been set.
    33. PRO-2220
      The eflowsubst command overwrites an existing substitution file without warning.
    34. PRO-2140
      When a dataflow created with expressor Studio Version 3.3 is opened in Version 3.4, the comment blocks before and after created in the Transform Editor are visible in the Rules Editor, though they are meaningless in the Rules Editor.
    35. PRO-2028
      Reject options for Error handling do not work when a Write Table operator is connected to an Informix database.
    36. PRO-1812
      Cannot write a nil value to a Teradata long varchar column.
    37. PRO-1655
      Bulk load mode does not work when writing to a Sybase database.
    38. PRO-1925
      No error is generated when reading in a decimal that contains more digits than the internal representation can handle.
    39. PRO-1628
      Decimal columns in Informix databases import to Read Table operator as SMALLFLOAT data type (was Bug 5659).
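Several of the known issues above (STU-1566, STU-3024, and STU-2728) describe conditions that are not flagged until the dataflow runs. As a rough illustration, such conditions could be checked ahead of time with a few lines of Python. This sketch is independent of expressor; the function names and the way settings are passed in are assumptions made for the example.

```python
def delimiter_conflicts(quote_char, field_delimiter, record_delimiter):
    """Return messages for delimited-schema settings that share the same value (STU-1566)."""
    settings = {
        "quote character": quote_char,
        "field delimiter": field_delimiter,
        "record delimiter": record_delimiter,
    }
    names = list(settings)
    problems = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if settings[first] == settings[second]:
                problems.append(f"{first} and {second} are both {settings[first]!r}")
    return problems


def lookup_table_problems(attributes, key_names):
    """Return messages for key names with spaces (STU-3024) and key-only tables (STU-2728)."""
    problems = [f"key name {key!r} contains a space" for key in key_names if " " in key]
    if not set(attributes) - set(key_names):
        problems.append("no non-key attribute, so there is nothing to look up")
    return problems


# Hypothetical settings: the quote character collides with the field delimiter.
schema_problems = delimiter_conflicts(",", ",", "\n")
table_problems = lookup_table_problems(["cust id"], ["cust id"])
```

An empty list from either function means none of the checked conditions was found.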

    The following issues identified in earlier product releases have been resolved.

    1. STU-3231
      Setting a rule to Disabled in a Filter operator does not disable the rule. Instead, an error results when the dataflow is run.
    2. STU-3224
      When opening a Lookup Table artifact, a warning message appears in a dialog box without a header. The hour glass cursor displays over the dialog box.
    3. STU-3222
      Vague error message about input parameters and input attributes not matching semantically when rules contain input parameters that are not bound to input attributes.
    4. STU-3221
      When two or more Assignment Rules are reordered with the Move Up or Move Down buttons in the Rules Editor, the lines connecting input to output attributes become disconnected.
    5. STU-3219
      Cannot rename a Project once it contains a dataflow.
    6. STU-3216
      The "Create Table Schema from Composite Type" wizard creates table columns from Composite Type attributes that have been deleted. This happens when the Composite Type attribute is based on a Shared Atomic Type and that Atomic Type is deleted. The result is that Studio stops working when the wizard tries to complete creation of the Table Schema. To avoid this problem, do not continue in the wizard if one or more of the Composite Type attributes displayed does not show an Atomic Type. To use the Composite Type to create a Table Schema, make sure all of its attributes have valid Atomic Types, either local or shared.
    7. STU-3210
      Converting a Local Composite Type to a Shared Composite Type can cause Studio to stop working when the Rules Editor is opened on an operator that uses attributes from that Composite Type.
    8. STU-3127
      Operations take a long time after Rules Editor has been opened and closed. Performance degradation due to a memory leak.
    9. STU-3125
      Cannot perform operations on one hundred or more output attributes in the Rules Editor.
    10. STU-3092
      Selecting large number of attributes and dragging to a rule's input does not work correctly. Most of the attributes are not mapped to the rule's input.
    11. STU-3084
      Setting an Output Operator property to a Schema with a large number of fields causes attribute propagation to take a very long time or freezes Studio.
    12. STU-3083
      When building a dataflow that uses a large Schema (>300 fields), switching selection from one operator to another involves noticeable delay.
    13. STU-3059
      When building a dataflow that uses a large Schema (>300 fields), creating a rule in the Rules Editor by selecting all attributes simultaneously and dragging them to the takes a very long time.
    14. STU-3048
      Naming requirements that apply to attributes are not applied to rule parameter names. The requirements are the same for both attribute names and rule parameter names.
    15. STU-3043
      Rules Editor does not wrap text in Expression and Function Rules. Same is true of editors for Datascript Modules and Read and Write Custom operators.
    16. STU-3039, STU-3041, STU-3062
      Using Schema with large number of fields (~260) on multiple operators in a dataflow takes up large amount of memory and causes Studio to freeze for long period (~10 minutes). Attempting to map all attributes to a rule's input in the Rules Editor freezes Studio.
    17. STU-3038
      Error message displayed when operators have incompatible Semantic Types is not clearly written.
    18. STU-3029
      Leading or trailing white space on Step name is not displayed and can thus be confusing when the name is used in an error message.
    19. STU-3028
      Leading or trailing white space on Rule name is not displayed and can thus be confusing when the name is used in an error message.
    20. STU-3025
      Dataflow does not correctly validate Required attributes after changes are made to it. [validate attributes]
    21. STU-3021
      Blocked attributes in the Aggregate operator are unblocked when an output attribute with the same name is deleted. The blocked attribute should remain blocked.
    22. STU-3014
      Transform Operator Template does not retain the names assigned to rules.
    23. STU-3012
      The current Project is not selected when assigning a Composite Type to a Lookup Table operator.
    24. STU-3011
      Filter Operator Template adds a nonfunctional rule.
    25. STU-3007
      When running a dataflow from a Deployment Package in Studio, the status pane does not have Copy and Save buttons like those provided when the dataflow is run independent of the Deployment Package.
    26. STU-3004
      When dataflow is deleted, its related objects are left in the "dfp" directory.
    27. STU-2995
      Incorrect error displayed when attribute names do not match after an upgrade. Prior to Version 3.4, case-sensitivity was not enforced on mapped attribute names. In Version 3.4, mapped attributes with same name but unmatched cases are flagged as incompatible. But error message about incompatibility does not indicate that the problem is with mismatched cases in attribute names.
    28. STU-2991
      Schema names not aligned in New Table Schema dialog box.
    29. STU-2990
      Moving back in screens of New Table Schema dialog box loses table selection.
    30. STU-2989
      Next and Back buttons in New SQL Query Schema behave inconsistently.
    31. STU-2985
      Unsaved dataflow with a reference to a Datascript Module displays runtime error indicating the Datascript Module cannot be found.
    32. STU-2982
      Expression Rule needs to display message indicating what text-entry space is used for, same as Expression Lookup Rule does.
    33. STU-2982
      Opening a Repository Workspace takes too long to retrieve the remote lock.
    34. STU-2977
      Misspelling in validation error message.
    35. STU-2957
      Errors in a disabled dataflow Step continue to display until the dataflow is saved.
    36. STU-2949
      When upgrading a pre-Version 3.4 dataflow with an Aggregate operator, the Change function is not handled correctly.
    37. STU-2921
      Write File operator fails at runtime when its Schema contains fields that are not mapped to attributes.
    38. STU-2916
      User is not warned when changing the Composite Type used in a Lookup Table operator that the change might make it impossible to read the Lookup Table.
    39. STU-2912
      Database usernames and passwords are not overwritten properly by a substitution file.
    40. STU-2886
      Incorrect binding is created and wrong message displayed when user drags an output attribute to a rule's output with the CTRL key when the attribute has an existing mapping.
    41. STU-2864
      Rule gets multiple input parameters with same name when an input attribute is mapped with the CTRL key.
    42. STU-2860
      When an output attribute mapped to an input attribute with an Assignment Rule is deleted, remnants of the Assignment Rule remain visible.
    43. STU-2852
      When multiple Required attributes are mapped to a rule's output, the attributes' icons should change to diamonds to indicate that they are connected.
    44. STU-2825
      Rules Editor does not have a mechanism to save output attributes as a Shared Composite Type.
    45. STU-2824
      The Rules Editor needs to display the input number on the output attribute to which the input is assigned.
    46. STU-2815
      Incompatibility when upgrading to Version 3.4 produces misleading messages.
    47. STU-2788
      No way to resolve duplicate names when importing output attributes from more than one Composite Type in the Pivot Row operator.
    48. STU-2782
      After upgrading a dataflow with a Read File operator that uses its Reject output to Version 3.4, the dataflow cannot be saved because the Composite Type it uses for rejected records is no longer in its LocalTypes list.
    49. STU-2667
      Datascript Module Editor not marking the loaded Datascript Module as primary.
    50. STU-2612
      Information pane for Transform operator is blank because of missing property description.
    51. STU-2574
      Studio shuts down instead of displaying an error message when user attempts to create a Connection to an Excel spreadsheet with a DSN that does not point to a worksheet.
    52. STU-2328
      Boolean data types in Postgres databases must map to string data type in the expressor Schema, not to integer.
    53. PRO-2284
      Table Schemas generated in Studio 3.3 and 3.4 from Semantic Types or from Delimited Schemas have incorrect information in date-time fields that prevent them from being used in a Version 3.4.1 Write Table operator. Table Schemas created from Semantic Types or Delimited Schemas must be regenerated.
    54. PRO-2272
      The eflowsubst command identifies itself as the ecrypt command.
    55. PRO-2267
      Unmapped attributes produce incorrect output in the Write File operator.
    56. PRO-2266
      The eflowsubst command does not locate dataflow .rpx files correctly.
    57. PRO-2253
      Error message misleading when change function in Aggregate operator fails.
    58. PRO-2252
      Setting "Accept Truncation" when mapping a decimal attribute to an integer Schema field does not work.
    59. PRO-2249
      The ekill command help text generated by -h option refers to "drawing" and "VDX file." That is old terminology from earlier versions of expressor software for what is "dataflow" and ".rpx file" in expressor 3.x versions.
    60. PRO-2248
      The etask command help text generated by -h option refers to "drawing." That is old terminology from earlier versions of expressor software for what is "dataflow" in expressor 3.x versions.
    61. PRO-2240
      The Write Table operator hangs while reading data when the Batch size is set to 1.
    62. PRO-2238
      When using the etask command to substitute an encrypted database password that contains an equal sign, the substitution fails.
    63. PRO-2237
      The etask command cannot run a dataflow from a Deployment Package unless it is in the working directory. Should accept pathname to a directory that is not the working directory.
    64. PRO-2233
      Misleading error message when DB2 database has non-nullable fields that are not represented in the expressor Schema.
    65. PRO-2231
      An SQL statement in the SQL Query operator fails to run even though it is a valid SQL statement.
    66. PRO-2230
      The Write Table operator cannot write currency data to a Postgres database.
    67. PRO-2227
      The reject output for sorted aggregations is wrong if the reason for rejection is violation of an output attribute constraint. If the reject reason is an error in the aggregation datascript rule, the output is as expected. The problem is that the first record sent to the reject port is the current input record, which for a sorted aggregation, is the first record of the next aggregation group, not the last record of the aggregation group that produced the output record that violates the constraint.
    68. PRO-2221
      The etask command accepts an invalid Step number with the -s option.
    69. INS-648
      Installation program installs in the same directory as a previous version, which overwrites files that should be retained.
    70. INS-647
      Not all expressor objects are removed from the Windows Start Menu after uninstall.
    71. INS-644
      Studio does not launch at the end of installation because of an error in the Windows Registry.
    72. DOC-262
      Documentation in the Substitute Parameters in a Dataflow topic and in the description of the etask command is incorrect. In the Substitute Parameters in a Dataflow topic, the etask command line for substituting an individual parameter does not include the -D parameter and the full syntax of the parameter name.
      Correct etask command:
      Code:
      etask -x Sample_Dataflow.rpx -D DeploymentPath 
                 -P Step_number@operator_name@property_name=property_value
      Also in Substitute Parameters in a Dataflow topic, the etask command line for running a dataflow with a substitution file does not include the -S parameter.
      Correct syntax is:
      Code:
      etask -x dataflow_name -D deployment_package_pathname -S substitution_filename
      The etask command topic describes the parameter name syntax with -S instead of -P.
    73. DOC-261
      Need documentation for specifying one of the available options on Operator properties with multiple options when substituting parameters.
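The two etask command lines given under DOC-262 can be assembled programmatically. The Python sketch below is one way to do that, using only the -x, -D, -P, and -S options shown above; the helper names and the sample step, operator, property, and path values are hypothetical. The split_substitution helper splits on the first equal sign only, which is one way to tolerate values that themselves contain an equal sign (the situation described in PRO-2238); it is not a description of how etask parses its arguments.

```python
import shlex


def substitution_arg(step, operator_name, property_name, property_value):
    """Format one -P argument as step@operator_name@property_name=property_value."""
    return f"{step}@{operator_name}@{property_name}={property_value}"


def split_substitution(arg):
    """Split a -P argument into its four parts, keeping any '=' inside the value."""
    name, sep, value = arg.partition("=")
    if not sep:
        raise ValueError("expected name=value")
    step, operator_name, property_name = name.split("@", 2)
    return step, operator_name, property_name, value


def single_parameter_command(dataflow, deployment_path, substitution):
    """Argument list for substituting one parameter on the command line."""
    return ["etask", "-x", dataflow, "-D", deployment_path, "-P", substitution]


def substitution_file_command(dataflow, deployment_path, substitution_file):
    """Argument list for running a dataflow with a substitution file."""
    return ["etask", "-x", dataflow, "-D", deployment_path, "-S", substitution_file]


# Hypothetical values; the property value deliberately contains '=' characters.
arg = substitution_arg("1", "Write_Table", "password", "abc=123==")
argv = single_parameter_command("Sample_Dataflow.rpx", "C:/deploy", arg)
command_line = shlex.join(argv)  # printable form, quoted where needed
```

Building an argument list rather than a single string avoids quoting problems when operator names or paths contain spaces.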