• Recent Articles

    jsiwila

    Processing Records Rejected by a Read File Operator

    This is a companion article to Processing Rejected Records. That article deals with records rejected by the Read Excel operator, and it also applies to Read operators in other expressor Extensions and to the Read Custom operator.

    Parsing records rejected by the Read File operator is simpler than parsing records from Read Excel and related operators. The Read File operator retains the sequence of fields and attributes when it writes them to the RecordData field of the Reject Record Schema. Because of that, it is not necessary for the Read File operator to write header records to describe the field and attribute sequence. That greatly simplifies the datascript required to parse the RecordData field and reconstruct the rejected records.

    As explained in the Processing Rejected Records article, when a Read operator... read more
    jsiwila 05-14-2012, 03:33 PM
    jlifter

    expressor tutorials

    In this section you will find a collection of older tutorials that demonstrate the features of expressor Studio and the entire expressor Data Integration Platform. Though... read more
    jlifter 05-08-2012, 04:08 PM
    jsiwila

    Release Notes - expressor 3.6.0, 3.6.1, 3.6.2, 3.6.3, and 3.6.4

    expressor 3.6.4 fixes four bugs, three in Studio (STU-4728, STU-4730, and STU-4744) and one having to do with reading Informix databases from dataflows running on Linux (PRO-2634). See the resolved issues section below for a description of each of the fixed bugs.

    Also, see STU-4729 in the known issues section below. STU-4729 describes an issue encountered when upgrading artifacts in a Repository Workspace.

    Note that the Informix ODBC drivers shipped with expressor software do not support Unicode on Linux.

    expressor 3.6.3 fixes three bugs in Studio: one covering memory leaks, another dealing with binding to the Oracle NUMBER data type, and a third that caused unnecessary rounding when converting a double precision value to decimal. See STU-4641, STU-4625, and PRO-2622 under resolved issues below. Also see STU-4657 in the known issues section immediately following for a workaround to a problem dealing with NUMBER... read more
    jsiwila 05-08-2012, 09:02 AM
    jsiwila

    Processing Rejected Records

    When records produce errors because they violate constraints set on Composite Type attributes or for other reasons, the operator that encounters the error can handle them by skipping them, aborting the dataflow, or rejecting the offending records. In some cases, it is sufficient to simply send rejected records to a Write File operator and examine the records in the output file. If the intent, however, is to correct or otherwise use those records, examining each error and changing the data could be very cumbersome. A more efficient approach is to reprocess the records as they come out of the reject port.

    Records rejected by input operators such as Read File, Read Table, and Read Excel are structured into the following fields:

    RejectType
    RecordNumber
    RecordData
    RejectReason
    RejectMessage

    The record data as it was constituted before being rejected is contained in the RecordData field. To process that data, it must first be reconstructed from the rejected record format. Several factors affect the reconstruction. The order of the record data fields can be different from the order represented in the original Schema, and some of the records emitted from the reject port do not contain record data. For example, RejectType 1 errors are constraint violations, but before they are emitted, a RejectType 4 record is emitted. The RejectType 4 record contains the record data field order for the subsequent RejectType 1 errors in its RecordData field. The RejectTypes are fully explained in the Using the Reject Port section of the Read Custom operator topic in the product documentation.
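
    The reconstruction described above can be sketched in a few lines. This is only an illustration in Python, not expressor datascript, and it rests on assumptions that the text does not confirm: that RecordData holds one CSV-formatted line, that a RejectType 4 record lists the field names in order, and that each later RejectType 1 record lists the values in that same order. The function names and the sample records are hypothetical.

```python
import csv
import io


def parse_csv_line(text):
    """Parse one CSV-formatted line into a list of field values."""
    return next(csv.reader(io.StringIO(text)), [])


def rebuild_rejects(reject_records):
    """Rebuild RejectType 1 records using the most recent RejectType 4 header.

    Assumption: RecordData is a single CSV line in both record types.
    """
    field_order = None
    rebuilt = []
    for rec in reject_records:
        if rec["RejectType"] == 4:
            # Header record: remember the field order for what follows.
            field_order = parse_csv_line(rec["RecordData"])
        elif rec["RejectType"] == 1 and field_order is not None:
            values = parse_csv_line(rec["RecordData"])
            rebuilt.append({
                "RecordNumber": rec["RecordNumber"],
                "RejectMessage": rec["RejectMessage"],
                "fields": dict(zip(field_order, values)),
            })
    return rebuilt


# Hypothetical sample data, shaped like the five reject fields listed above.
sample = [
    {"RejectType": 4, "RecordNumber": 0, "RecordData": "name,id,city",
     "RejectReason": "", "RejectMessage": ""},
    {"RejectType": 1, "RecordNumber": 7, "RecordData": '"Smith, J",42,Boston',
     "RejectReason": "constraint", "RejectMessage": "id out of range"},
]

for row in rebuild_rejects(sample):
    print(row["RecordNumber"], row["fields"])
```

    Keeping the header in a variable that is overwritten by each new RejectType 4 record means that a change in field order partway through the reject stream is picked up automatically.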

    Note: All non-input operators that have a reject port emit rejected records with the existing attributes of the record; that is, they do not restructure the records the way input operators do. Records rejected by non-input operators do not have to be reconstructed before they are reprocessed
    ... read more
    jsiwila 05-07-2012, 10:00 AM
  • Release Notes - expressor 3.5.2

    expressor 3.5.2 is a 32-bit ETL application that can be deployed onto computers running the Windows operating system. expressor Studio should be installed onto computers running 32- or 64-bit Windows XP or Windows 7. expressor Repository and expressor Data Processing Engine may be installed onto computers running 32- or 64-bit Windows Server 2003 or 2008, Windows XP SP3, or Windows 7 Professional or Enterprise.

    The following list of search terms can help locate specific issues. Many of the terms have more than one reference in the list below. The terms occur either within the text of the issue description or in a search term list appended to the issue description.

    Use your browser's Find box to locate the references.
    • validate attributes
    • large number of attributes
    • table schema
    • delimited schema
    • lookup keys
    • output attributes
    • earlier versions
    • error handling
    • datascript module reference
    • function rule

    What's new in expressor 3.5.2

    expressor 3.5.2 includes the following new features and functionality.
    The 3.5.2 installer now includes all three extensions (Excel, QlikView, Salesforce). Users must have a valid non-Community license to run dataflows that utilize these extensions. Please review information specific to the Desktop Edition before trying to use these extensions.

    • Excel extension: Allows customers to natively read and write XLSX (Excel 2008 Workbook) and XLS (Excel 97-2003 Workbook) files. Excel connectivity in expressor no longer requires use of the Microsoft Excel ODBC driver.
      • The Excel Schema artifact captures the metadata for an Excel file and maps it to a Semantic Type. It handles all relevant type conversions to/from the Excel file into the semantic type model and allows users to map Excel data to a common set of metadata types and constraints. This schema can be created from an existing Semantic Type in expressor making it easy to create standard Excel data files from any expressor dataflow.
      • Read Excel and Write Excel operators can read and write XLSX and XLS files. As with all operators, many properties for these operators can be set in the same manner as other substitutable properties in 3.4.



    • QlikView extension: This new Extension enables reading, writing, and processing of QVX-formatted data.
      • The QlikView Extension can be activated and used with any licensed Studio 3.5.1 version. Dataflows can be built using components from the Extension with a free Community license, but they will not run until a Desktop or Standard Edition license is installed.



    • Salesforce Dot Com extension: An expressor extension that supports reading from and writing to Salesforce Dot Com databases.
      • Dataflows using this functionality may be developed with any expressor edition. Running dataflows that use this functionality requires either Desktop or Standard licensing.


    Installation

    If you have used an earlier version of expressor Studio, be certain to back up your workspaces before installing expressor Studio 3.5.2. In your My Documents folder (or in whichever folder you stored your expressor workspaces), make a copy of the expressor folder. If you later decide to uninstall expressor 3.5.2 and re-install the previous version, you will need to delete any workspaces created with expressor 3.5.2 and return to the workspaces created with the prior version.
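
    The backup step above can also be scripted. The following is a minimal sketch in Python, assuming the expressor folder sits in the location you pass in; the function name backup_workspaces and the timestamped folder name are hypothetical and are not part of the expressor product.

```python
import shutil
import time
from pathlib import Path


def backup_workspaces(expressor_dir, backup_root=None):
    """Copy the expressor folder to a timestamped folder and return its path.

    By default the copy is created next to the original folder.
    """
    source = Path(expressor_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"No expressor folder at {source}")
    target_parent = Path(backup_root) if backup_root else source.parent
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = target_parent / f"{source.name}-backup-{stamp}"
    # copytree refuses to overwrite an existing folder, so an earlier
    # backup is never replaced silently.
    shutil.copytree(source, target)
    return target


if __name__ == "__main__":
    # Assumed default location on Windows 7; adjust to wherever the
    # workspaces are actually stored.
    default_dir = Path.home() / "Documents" / "expressor"
    if default_dir.is_dir():
        print("Backup written to", backup_workspaces(default_dir))
```

    Run the script before starting the 3.5.2 installer so that the copy reflects workspaces that the new version has not yet opened.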

    Note: If you receive a message during installation on Windows XP that indicates hotfix 943326 is not installed on your system, download it from this URL: http://support.microsoft.com/kb/943326.

    If you receive a message during installation that hotfix 967328 is not installed on your system, contact Microsoft support, starting at the following URL: http://support.microsoft.com/.

    If you want to completely remove prior installations and all workspaces and project artifacts:
    • Use the Windows Control Panel utility to uninstall expressor Studio.
    • Delete the directories:
      • Windows 7:
        • C:\Users\username\AppData\Roaming\expressor
        • C:\Users\username\AppData\Local\expressor
        • C:\Users\username\AppData\Local\expressor_software

      • Windows XP:
        • C:\Documents and Settings\username\Application Data\expressor


    • Discard the download file expressorStudioInstaller.exe from prior installations.
    • Delete, or rename, the Workspaces directory.
      • Windows 7:
        • C:\Users\username\Documents\expressor\Workspaces

      • Windows XP:
        • C:\Documents and Settings\username\My Documents\expressor\Workspaces



    The following known issues have been identified.

    1. STU-4091
      Do not select the options to create an Excel schema from a type or from the output of an upstream operator. You must create an Excel schema from an existing Excel file.
    2. STU-3946
      When a dataflow aborts while writing to an Excel file, data in the target file can be corrupted.
    3. STU-3945
      The Read Excel operator cannot read formula cells in an .xlsx file.
    4. STU-3886
      Incorrect columns selected and mapped with SQL Query operator after modifying mapping set.
    5. STU-3636
      Studio unable to connect using Teradata ODBC driver.
    6. STU-3630
      Overwriting an Operator Template does not flag the Template as modified in a Repository Workspace, and as a result, the change is not committed to the Repository.
    7. STU-3628
      After opening a Workspace with Studio 3.5, cannot open it in the earlier version of Studio with which it was created.
      Workaround 1: Create a backup copy of Workspaces before opening them in Studio 3.5.
      Workaround 2: In Studio 3.5, disable all Extensions in the Workspace. The Workspace can then be reopened in the version of Studio in which it was created.
    8. STU-3335
      When reading large Salesforce.com objects (e.g., 1.5 MB), the Salesforce.com data type base64 produces an error. On smaller objects, it converts to the expressor String data type as expected.
    9. STU-3186
      Rules Editor's type-ahead feature does not recognize changes to input parameters. It displays old input parameter names instead of the current input parameter names. [validate attributes]
    10. STU-3112
      In the Rules Editor, selecting and dragging a large number of output attributes to a rule's output parameters takes a long time to process.
    11. STU-3106
      Noticeable delay when opening the Rules Editor for an operator that has a large number of input attributes. [large number of attributes]
    12. STU-3085
      Changes to dataflows are not indicated in the Deployment Packages that contain them.
    13. STU-2951
      Studio does not always activate the most permissive license installed; it sometimes activates the Studio-only license even though a more permissive license has been installed.
    14. STU-2773
      Input attributes are not available in the Function Rule datascript editor if the Function Rule was created by conversion from another type of rule. Even if inputs are added manually to the rule, they do not appear when typing datascript. However, all works fine if the Function Rule is created from scratch.
    15. STU-2728
      There is no validation to ensure Lookup Tables have at least one attribute that is not a key. However, if a Lookup Table does not have at least one attribute that is not a key, then it contains nothing to look up. [lookup keys]
    16. STU-2163
      No notification of lost Datascript Module reference. When a Project’s Library Reference is removed and the Library contains a Datascript Module that is used by an open dataflow, no validation error is displayed. The broken reference to the Datascript Module is not manifest until the dataflow runs and fails.
    17. STU-1942
      When defining "allowed values" for a String data constraint, the default values must be included in the list of allowed values.
    18. STU-1566
      Quote character, field delimiter, and record delimiter cannot be the same in a delimited schema, but validation of the schema does not fail if they are the same.
      The characters used as the quotation mark and the field and record delimiters cannot be the same. This restriction is documented in the Create Delimited Schema topic in online help. But violation of this restriction is not indicated when the settings are specified for the schema. The conflicts will, however, cause an error when the dataflow runs.
    19. PRO-2432
      Schemas for rejected records do not preserve the order of attributes when writing the rejected record output.
    20. PRO-2426
      When substituting parameters on the etask command line, cannot substitute a file name with the @ character.
    21. PRO-2354
      When the Read Table and SQL Query operators reject a record, the CSV representation of the rejected record in the RecordData attribute might not contain valid CSV data.
    22. PRO-2333
      The utility.encrypt function incorrectly calls base64 encoding code instead of encryption code.
    23. PRO-2296
      The eflowsubst command's -O option places the substitution file in the external directory under the Deployment Package instead of the Deployment Package's dataflow directory.
    24. PRO-2287
      When Generate record is chosen as the On Miss action in a Lookup rule, a value must be supplied for all the output parameters in the rule, even those that are not mapped to output attributes. To work around this, a meaningless value can be assigned to the output parameters that are not mapped to output attributes.
    25. PRO-2259
      The Unique key value in a Lookup Table gets changed even after an error indicates that changing a Unique key value is not allowed. [lookup keys]
    26. PRO-2220
      The eflowsubst command overwrites an existing substitution file without warning.
    27. PRO-2140
      When a dataflow created with expressor Studio Version 3.3 is opened in Version 3.4, the before and after comment blocks created in the Transform Editor are visible in the Rules Editor, though they are meaningless there.
    28. PRO-2028
      Reject options for Error handling do not work when a Write Table operator is connected to an Informix database.
    29. PRO-1925
      No error is generated when reading in a decimal that contains more digits than the internal representation can handle.
    30. PRO-1812
      Cannot write a nil value to a Teradata long varchar column.
    31. PRO-1655
      Bulk load mode does not work when writing to a Sybase database.
    32. PRO-1628
      Decimal columns in Informix databases import to Read Table operator as SMALLFLOAT data type (was Bug 5659).
    33. INS-680
      Studio Version 3.5.1 will not start when installed after an earlier Studio version is uninstalled. Some antivirus programs can cause this behavior when the Studio installation program is run from the administrator account.
      Workaround: uninstall Studio Version 3.5.1, turn off the antivirus program, reinstall Studio Version 3.5.1, and turn the antivirus program back on.
    34. INS-644
      Windows Registry sometimes displays error when Studio is launched after installation completes.
      Workaround: uninstall Studio and delete the expressor folder in the Program Files directory or wherever Studio was installed.
    35. DOC-292
      The Create an Excel Schema topic in the online help contains two notes that indicate field data types are determined by reading the first non-data row. The row used to determine data types is the row that begins at the cell specified by the Top left cell for data setting in the Schema.
    36. DOC-189
      Cannot view expressor documentation from Start Menu (expressor>expressor3>expressor Documentation) with Google Chrome browser. Chrome requires HTML files be served by a web server, not read directly on a local system.
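
    Because the delimiter conflict described in STU-1566 above is not caught when the schema is defined, it can be worth checking the three settings before a dataflow runs. The sketch below is a stand-alone Python illustration of such a check. It is not part of expressor, and the function name and setting labels are hypothetical.

```python
def find_delimiter_conflicts(quote_char, field_delimiter, record_delimiter):
    """Return every pair of delimited-schema settings that share a value."""
    settings = {
        "quote character": quote_char,
        "field delimiter": field_delimiter,
        "record delimiter": record_delimiter,
    }
    names = list(settings)
    conflicts = []
    # Compare each setting with every setting that comes after it.
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if settings[first] == settings[second]:
                conflicts.append((first, second))
    return conflicts


# A typical CSV configuration has no conflicts.
print(find_delimiter_conflicts('"', ",", "\r\n"))

# Using a comma for both quoting and field separation is reported.
print(find_delimiter_conflicts(",", ",", "\n"))
```

    An empty list means the three settings are distinct. Any returned pair names the settings that would cause the dataflow to fail at run time.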


    Interoperability Issues.

    1. STU-3635
      Using an ODBC driver for an old Excel format (.xls) to connect to a later version Excel file produces an error message and might cause Studio to stop working.
      Workaround: use the Excel Extension (available in expressor 3.5.2) to read and write Excel files, both .xls and .xlsx.
    2. STU-3566
      Oracle Table Schema fields with the NUMBER data type can generate constraints on Composite Type Attributes that cause valid data from the table to be rejected. The constraints are the result of the ODBC driver's interpretation of NUMBER when precision and scale are not defined. For Attributes mapped to Oracle Table Schema fields that have the NUMBER data type, users should set the constraints manually to ensure they match the table data.

    The following issues identified in earlier product releases have been resolved.

    1. DOC-269
      Version 3.5 uses a new method to encrypt passwords. Encrypted values saved in substitution files from earlier versions of expressor software must be regenerated with Version 3.5.
    2. STU-3920
      Studio crashes when an Operator Template is created and then the operator from which it was created is deleted.
    3. STU-3773
      When creating a delimited schema, Studio can't distinguish between LF and CR+LF. This problem presents when CR+LF is used as the record delimiter, and LF is present as a valid non-delimiting character in the record data. Note that this is not a problem in the Engine.
    4. STU-3770
      Need to use CR alone as a delimiter when creating a Delimited File Schema.
    5. STU-3753
      An ExpressionRequires statement remains after the Expression Rule that uses the statement to call a Datascript Module is changed to a Function Rule.
    6. STU-3737
      The option to create a new Schema from Upstream Output does not display on the Write Salesforce operator's Schema property drop-down menu.
    7. STU-3707
      Editing mapping between Salesforce Schema fields and attributes is not allowed, but the Edit and Edit Mapping buttons are not disabled.
    8. STU-3701
      Setting error handling to Reject Record on the Read Salesforce operator causes the dataflow to fail when run.
    9. STU-3699
      Selected objects for Salesforce Schema creation are deselected if wizard Back button is used.
    10. STU-3696
      Studio crashes when the Copy Results button is used to copy a very large number of messages.
    11. STU-3695
      No error message from Studio when Salesforce password has expired. Salesforce Connection simply does not connect to Salesforce.com.
    12. STU-3690
      Documented that String values that contain commas are converted to null values when mapping Salesforce.com data types to expressor data types.
    13. STU-3666
      Syntax highlighting in Rules Editor does not show mismatches correctly. No runtime error reported, but data loss results.
    14. STU-3661
      Dataflow continues to run in Studio even though referenced Datascript Module has been removed.
    15. STU-3618
      Join operator does not pick up new or changed Library References in the Project containing the dataflow.
    16. STU-3585
      When a joiner function in a Join operator changes the data type of a parameter, the new data type is correctly assigned to the mapped Attribute. But the Dataflow's Message panel displays an error indicating that data types do not match. The same error is displayed when a Join operator output Attribute does not have the same data type as an identically named Attribute on Input1, even though the two Attributes are not mapped to one another.
    17. STU-3517
      Dataflow continues to run in Studio even though referenced Datascript Module has been renamed.
    18. STU-3358
      Dataflow that uses Salesforce Extension will not run if the Salesforce Extension is enabled after the dataflow was opened in Studio.
    19. STU-2776
      Join operator in Studio 3.4 does not always run as expected in dataflows created with earlier versions of Studio.
    20. PRO-2407
      eflowsubst command generates substitution files that produce errors when used with etask command.
    21. PRO-2406
      Failure of the etask command with the unhandled exception "the file already exists" might require the tmp directory to be cleared. If the problem persists, check for other file system problems.
    22. PRO-2403
      Funnel operator not processing both inputs.
    23. PRO-2402
      Constraints on datetime attributes are not maintained when running a dataflow with a Read Salesforce operator.
    24. PRO-2399
      Data Processing Engine must check that all inputs to Funnel operator have identical Composite Types.
    25. PRO-2294
      Running a dataflow containing a Write Parameters operator in a Free Studio Edition produces vague error message. Error message should indicate that the Write Parameters operator cannot be used with Free Studio.
    26. INS-688
      Studio installation fails without producing error message.
    27. INS-684
      Installation program crashes if user clicks X to close window instead of Retry button after entering an incorrect password for the Repository.
    28. DOC-284
      References to Hotfix 967328 changed because the hotfix is no longer available on the Microsoft web site.
    29. DOC-283
      Obsolete documentation for datascript.optimize.flag removed.
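
    The distinction behind STU-3773 above, where CR+LF is the record delimiter and a bare LF is ordinary data, can be illustrated outside expressor. The Python sketch below assumes the file content is already in memory as a string; the function name split_records is hypothetical.

```python
def split_records(text, record_delimiter="\r\n"):
    """Split text on the exact record delimiter only.

    A bare LF inside a record is kept as data because it does not match
    the two-character CR+LF delimiter.
    """
    records = text.split(record_delimiter)
    if records and records[-1] == "":
        records.pop()  # drop the empty piece after a trailing delimiter
    return records


data = "a,line1\nline2\r\nb,x\r\n"

# Two records; the first keeps its embedded LF.
print(split_records(data))

# str.splitlines() treats LF as a line break too, so it yields three pieces.
print(data.splitlines())
```

    Splitting on the full delimiter string, rather than on any line-break character, is what keeps the embedded LF inside the first record.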