Profiles 0.18.x Changelog

Changelog for Profiles v0.18.x.

Version 0.18.4

25 October 2024

Bug Fixes

  • The schema name has been prepended to the drop statements executed during cleanup. This ensures that deletions are always performed in the correct schema.

Version 0.18.3

16 October 2024

Bug Fixes

  • Fixed a bug that caused cleanup of materials to fail with a "current transaction is aborted" error. With the fix, if cleanup of one material fails (for example, because other objects depend on it), cleanup of the other expired materials continues.

Version 0.18.2

4 October 2024

Bug Fixes

  • Resolved a migration bug that occurred when nested model folders contained non-YAML files.
  • Cleanup with the --remove_latest_view_ptrs flag did not respect the retention period set in pb_project.yaml. This is fixed.

Version 0.18.1

3 October 2024

Bug Fixes

  • Fixed an issue in the default ID stitcher for run_type: discrete.
  • Resolved a bug that occurred when a project contained multiple entities with the same cohort name.
  • During migration, pb was skipping non-YAML files, which caused scheduled runs to fail. This is fixed.

Version 0.18

26 September 2024
Schema version: 80

What’s New

  • The cohort model now lets you filter using a filter_expression followed by an AND/OR list of expressions, for example:
models:
   - name: high_value_us_residents
     model_type: cohort
     model_spec:
       ...
       filter_expression:
         AND:
           - {{ user.Var('country') }} = 'US'
           - {{ user.Var('salary') }} > 10000
  • You can define the retention_period for each model of a project. Further, the pb cleanup materials --expired command cleans up the materials beyond the defined retention period.
  • Referring to other entity_vars/input_vars is now simpler. You can use {{entityName.entity_varName}} instead of the earlier {{entityName.Var("entity_varName")}}. Note that the earlier syntax still works.
  • You can use features of an SQL model while using a cohort. To do so, specify the entity_key or entity_cohort in the model_spec of an SQL model.
  • pb cleanup materials --concurrency: A new option that enables concurrent cleanup by setting the number of concurrent workers. The default value is 1.
  • The default offset value for the pb run command is now 0. Previously, it was 30 minutes.
  • A new flag, --end_time_offset, is added to the compile/run commands to apply an offset to the end timestamp, in a human-readable format. It means that RudderStack does not use any data you load in the warehouse after the offset end timestamp for that run. For example, pb run --end_time_offset=45m ensures that RudderStack does not use any data loaded in the 45 minutes before the run’s start time. Note that you can’t use this flag with the seq_no or end_time flags.
  • You can now import packages using SSH URLs, for example, ssh://git@host:port/path.git.
  • You can run or import projects hosted on S3 as packages by adding block_store_creds in your site configuration file. To run the project, execute the pb run -p s3://<url> command.
  • Running a project with the --migrate_on_load flag now stores generated artifacts in the output subfolder instead of migrations.
  • For an entity_var/input_var, the default key has been renamed to default_value.
  • Simplified the project created using pb init pb-project by removing the dependency on the corelib package, the sample SQL model, model contracts, and CSVs in the inputs file.
  • RudderStack now uses INNER JOIN instead of RIGHT JOIN when calculating entity_vars. This improves performance and also prevents some values from getting lost.
  • A feature view model with main_id as the identifier is now created by default.
  • Schema has been updated from 72 to 80.
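
As a minimal sketch of the simplified reference syntax, the cohort filter from the first item in this list could be written as follows. It assumes that country and salary are entity_vars defined on the user entity, as in that example. The elided model_spec keys are left as they are, and the unquoted template expressions mirror the original example.

```yaml
models:
  - name: high_value_us_residents
    model_type: cohort
    model_spec:
      ...
      filter_expression:
        AND:
          # Shorthand {{ entityName.entity_varName }} form, assumed here to be
          # equivalent to {{ user.Var('country') }} and {{ user.Var('salary') }}
          - {{ user.country }} = 'US'
          - {{ user.salary }} > 10000
```

The earlier {{ user.Var('country') }} form continues to work, so existing projects do not need to change.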

Improvements

  • By default, RudderStack ignores all the blank values in the ID stitcher model.
  • HTML reports generated using the pb show idstitcher-report command have a slight aesthetic improvement.
  • Relevant errors are now thrown if you specify an unknown YAML key in the model definition.

Bug Fixes

  • The validity_time key has been removed.
  • The pb validate access command, for Databricks, now checks only for the necessary permissions and not for ALL the privileges.

Known Issues

BigQuery

  • The pb validate access command does not work for BigQuery.

Redshift

  • If two different users create material objects in the same schema, RudderStack gives an error during cleanup when trying to drop views created by the other user, like user_var_table.
  • Cross-database references can fail on Redshift for a few clusters.
  • While creating Activations, validation for Redshift does not work correctly in the RudderStack dashboard.

Databricks

  • Concurrency does not work for cleanup.

Other issues

  • Linux users might see this warning for all command runs - you can ignore it: WARN[0000]log.go:228 gosnowflake.(*defaultLogger).Warn DBUS_SESSION_BUS_ADDRESS envvar looks to be not set, this can lead to runaway dbus-daemon processes. To avoid this, set envvar DBUS_SESSION_BUS_ADDRESS=$XDG_RUNTIME_DIR/bus (if it exists) or DBUS_SESSION_BUS_ADDRESS=/dev/null.
  • pb insert does not work for Redshift, Databricks, and BigQuery.
  • If you refer to a public package in the project and get an ssh: handshake failed error, manually remove the entire folder from WhtGitCache to make it work.
  • Timegrains is an experimental feature. There might be some undiscovered issues.
