Science and technology

Audit your sharding database algorithm

Thanks to the ShardingSphere community’s steady evaluate and suggestions to develop options similar to data sharding and browse/write splitting, our crew discovered that some customers create many shards when utilizing the information sharding characteristic.

In such instances, there could be 1,000 bodily tables comparable to a sharding logical desk, which disturbs customers.

For occasion, a SELECT * FROM t_order assertion will result in a full-route, which is clearly not the case for OLTP. This SQL could be positioned in one other Proxy to keep away from blocking different requests.

However, if customers will not be acquainted with Proxy or the best way to write a the place situation and do not know that sharding is just not supported on this situation, a full-route remains to be required.

A full-route can decrease the efficiency of Proxy and even consequence within the failure of an inexpensive request. Imagine that there are 1,000 shards in a bodily database. If they’re executed in parallel, 1,000 connections are wanted, and if in serial, the request can result in a timeout. For this cause, neighborhood customers requested whether or not the unreasonable request might be intercepted instantly.

Our crew thought of the difficulty for some time. One choice is to easily block the full-route operation. Doing so requires a verify within the code and including a swap to the configuration file. On the opposite hand, if the consumer later must set a desk to read-only or requires the replace operation to hold a restrict, does that imply the code and configuration change once more? This strategy clearly goes towards the pluggable logic of Proxy.

In response to the above issues, the not too long ago launched Apache ShardingSphere 5.2.0 offers customers with auditing for the SQL sharding perform. The audit can both be an interception operation or a statistical operation. Similar to the sharding and distinctive key era algorithms, the audit algorithm is plugin-oriented, user-defined, and configurable.

[ Related read 5 new improvements in Apache ShardingSphere ]

Next, I’ll elaborate on the implementation logic for auditing knowledge sharding with particular SQL examples.

Audit for sharding interface

The entrance to Apache ShardingSphere’s audit is within the org.apache.shardingsphere.infra.executor.verify.SQLCheckEngine class, which can invoke the verify methodology of the SQLChecker interface. Currently, the ShardingSphere audit incorporates an audit for permission (confirm username and password) and an audit for sharding.

This instance focuses on the mum or dad interface applied within the ShardingAuditChecker of audit for sharding.

(Yacine Si Tayeb, CC BY-SA 4.0)

You can study its working rules rapidly by viewing the verify code of org.apache.shardingsphere.sharding.checker.audit.ShardingAuditChecker.

public interface ShardingAuditAlgorithm extends ShardingSphereAlgorithm {
   
    /**
     * Sharding audit algorithm SQL verify.
     *
     * @param sqlStatementContext SQL assertion context
     * @param parameters SQL parameters
     * @param grantee grantee
     * @param database database
     * @return SQL verify consequence
     */

    SQLCheckResult CHECK(SQLStatementContext<?> sqlStatementContext, List<Object> parameters, Grantee grantee, ShardingSphereDatabase DATABASE);
}

This methodology obtains the audit methods of all of the sharding tables concerned and invokes the audit algorithms configured in every sharding desk audit technique. An exception is exhibited to the consumer if an audit algorithm fails to move.

Some customers could marvel what disableAuditNames does right here. The sharding audit additionally permits customers to skip this course of. In some instances, customers could have to execute SQL that ought to have been blocked by the audit, and they’re conscious of the impression of this SQL.

Users can make the most of the Hint: disableAuditNames to skip audit interception, which shall be described with sensible examples later. The Proxy Administrators can configure enableHintDisable to regulate whether or not to permit customers to skip this course of. The default worth is true, indicating {that a} Hint-based skip is permitted.

Audit for sharding algorithm

The audit for sharding algorithm interface org.apache.shardingsphere.sharding.spi.ShardingAuditAlgorithm is inherited from SPI class ShardingSphereAlgorithm. It inherits sort and props properties and defines its personal verify methodology. If you need to customise your audit algorithm, simply implement the interface and add it to INF.providers.

(Yacine Si Tayeb, CC BY-SA 4.0)

public interface ShardingAuditAlgorithm extends ShardingSphereAlgorithm {
   
    /**
     * Sharding audit algorithm SQL verify.
     *
     * @param sqlStatementContext SQL assertion context
     * @param parameters SQL parameters
     * @param grantee grantee
     * @param database database
     * @return SQL verify consequence
     */

    SQLCheckResult CHECK(SQLStatementContext<?> sqlStatementContext, List<Object> parameters, Grantee grantee, ShardingSphereDatabase DATABASE);
}

Apache ShardingSphere implements a normal audit for sharding algorithm org.apache.shardingsphere.sharding.algorithm.audit.DMLShardingSituationsShardingAuditAlgorithm, particularly the above-mentioned SQL assertion that intercepts the full-route.

The algorithm makes choices by figuring out whether or not the sharding situation is null. Of course, it will not intercept broadcast tables and non-sharding tables.

public remaining class DMLShardingSituationsShardingAuditAlgorithm implements ShardingAuditAlgorithm {
   
    @Getter
    non-public Properties props;
   
    @Override
    public void init(remaining Properties props) {
        this.props = props;
    }
   
    @SuppressWarnings({"rawtypes", "unchecked"})
    @Override
    public SQLCheckResult CHECK(remaining SQLStatementContext<?> sqlStatementContext, remaining List<Object> parameters, remaining Grantee grantee, remaining ShardingSphereDatabase DATABASE) {
        IF (sqlStatementContext.getSqlStatement() instanceof DMLStatement) {
            ShardingRule rule = DATABASE.getRuleMetaData().getSingleRule(ShardingRule.class);
            IF (rule.isAllBroadcastTables(sqlStatementContext.getTablesContext().getTableNames())
                    || sqlStatementContext.getTablesContext().getTableNames().stream().noneMatch(rule::isShardingDesk)) {
                RETURN NEW SQLCheckResult(TRUE, "");
            }
            ShardingConditionEngine shardingConditionEngine = ShardingConditionEngineFactory.createShardingConditionEngine(sqlStatementContext, DATABASE, rule);
            IF (shardingConditionEngine.createShardingSituations(sqlStatementContext, parameters).isEmpty()) {
                RETURN NEW SQLCheckResult(FALSE, "Not allow DML operation without sharding conditions");
            }
        }
        RETURN NEW SQLCheckResult(TRUE, "");
    }
   
    @Override
    public String getType() {
        RETURN "DML_SHARDING_CONDITIONS";
    }
}

I’d prefer to introduce one other audit for the sharding algorithm: LimitRequiredShardingAuditAlgorithm. This algorithm can intercept SQL with out carrying a restrict within the replace and delete operations.

As this algorithm is much less common, it’s not at the moment built-in into Apache ShardingSphere. As you may see, it is extremely simple to implement a customized algorithm, which is why the audit for sharding framework is required. Thanks to its plugin-oriented structure, ShardingSphere boasts nice scalability.

public remaining class LimitRequiredShardingAuditAlgorithm implements ShardingAuditAlgorithm {
   
    @Getter
    non-public Properties props;
   
    @Override
    public void init(remaining Properties props) {
        this.props = props;
    }
   
    @SuppressWarnings({"rawtypes", "unchecked"})
    @Override
    public SQLCheckResult CHECK(remaining SQLStatementContext<?> sqlStatementContext, remaining List<Object> parameters, remaining Grantee grantee, remaining ShardingSphereDatabase DATABASE) {
        IF (sqlStatementContext instanceof UpdateStatementContext && !((MySQLUpdateStatement) sqlStatementContext.getSqlStatement()).getLimit().isPresent()) {
            RETURN NEW SQLCheckResult(FALSE, "Not allow update without limit");
        }
        IF (sqlStatementContext instanceof DeleteStatementContext && !((MySQLDeleteStatement) sqlStatementContext.getSqlStatement()).getLimit().isPresent()) {
            RETURN NEW SQLCheckResult(FALSE, "Not allow delete without limit");
        }
        RETURN NEW SQLCheckResult(TRUE, "");
    }
   
    @Override
    public String getType() {
        RETURN "LIMIT_REQUIRED";
    }
}

Use audit for sharding

Audit for sharding requires you to configure an audit technique for logical tables. To assist you to get began rapidly, its configuration is identical as that of the sharding algorithm and the sharding key worth generator.

There is an algorithm definition and technique definition, and a default audit technique can be supported. If the audit technique is configured within the logical desk, it impacts solely that logical desk.

If defaultAuditTechnique is configured within the logical desk, it takes impact for all of the logical tables beneath the sharding rule. Auditors are just like ShardingAlgorithms, auditStrategy to databaseStrategy, and defaultAuditTechnique to defaultDatabaseStrategy or defaultTableStrategy.

Please check with the next instance. Only the configuration of the audit for sharding is displayed. You should configure the sharding algorithm and knowledge supply your self.

guidelines:
  - !SHARDING
    TABLES:
      t_order:
        actualDataNodes: ds_${0..1}.t_order_${0..1}
        auditStrategy:
          auditorNames:
            - sharding_key_required_auditor
          enableHintDisable: TRUE
    defaultAuditTechnique:
      auditorNames:
        - sharding_key_required_auditor
      enableHintDisable: TRUE
    auditors:
      sharding_key_required_auditor:
        TYPE: DML_SHARDING_CONDITIONS

Step 1: Execute a question operation. An error is displayed because the audit technique for intercepting the full-database route is configured.

mysql> SELECT * FROM t_order;
ERROR 13000 (44000): SQL CHECK failed, error message: NOT enable DML operation WITHOUT sharding circumstances

Step 2: Add HINT. The title of the HINT is /* ShardingSphere trace: disableAuditNames */,and disableAuditNames is adopted by the auditorsNames configured within the previous command.

If a number of names exist, separate them with areas similar to/* ShardingSphere trace: disableAuditNames=auditName1 auditName2*/. After utilizing HINT, you may see that the SQL operation is efficiently executed.

mysql> /* ShardingSphere trace: disableAuditNames=sharding_key_required_auditor */ SELECT * FROM t_order;
+----------+---------+------------+--------+
| order_id | user_id | address_id | STATUS |
+----------+---------+------------+--------+
|       30 |      20 |         10 | 20     |
|       32 |      22 |         10 | 20     |
+----------+---------+------------+--------+
2 ROWS IN SET (0.01 sec)

Note: HINT requires you to switch the server.yaml configuration of Proxy. In addition, if you’re utilizing MySQL terminal to hook up with Proxy instantly, you must add the -c property—in any other case, HINT feedback shall be filtered out of the MySQL terminal and won’t be parsed by Proxy on the backend.

guidelines:
  - !SQL_PARSER
    sqlCommentParseEnabled: TRUE
    sqlStatementCache:
      initialCapacity: 2000
      mostSize: 65535
    parseTreeCache:
      initialCapacity: 128
      mostSize: 1024
props:
  proxy-trace-enabled: TRUE
mysql -uroot -proot -h127.0.0.1 -P3307  -c

DistSQL with audit for sharding

As you may see from the release notes, Apache ShardingSphere 5.2.0 helps the next DistSQL with audit for sharding perform:

CREATE SHARDING AUDITOR
ALTER SHARDING AUDITOR
SHOW SHARDING AUDIT ALGORITHMS
The following DistSQL shall be supported IN future releases:

DROP SHARDING AUDITOR
SHOW UNUSED SHARDING AUDIT ALGORITHMS
CREATE SHARDING TABLE RULE # together with AUDIT_STRATEGY

This publish launched how audit for sharding works with particular examples. I imagine you have already got a fundamental understanding of this perform and may use it everytime you want or use a customized algorithm.

You are additionally welcome to submit normal algorithms to the neighborhood. If you might have any concepts to contribute or encounter points together with your ShardingSphere, be happy to publish them on GitHub.


This article initially appeared on ShardingSphere 5.2.0: Audit for sharding intercepts unreasonable requests in multi-shards scenarios and is republished with permission.

Most Popular

To Top