Open-source News

Your guide to DistSQL's cluster governance capability

opensource.com - Wed, 08/24/2022 - 15:00
Raigor Jiang - Wed, 08/24/2022 - 03:00

The Apache ShardingSphere 5.0.0-Beta version with DistSQL made the project even more popular with developers and ops teams, thanks to advantages such as changes that take effect dynamically, no restarts, and an elegant syntax close to standard SQL. With the upgrades to 5.0.0 and 5.1.0, the ShardingSphere community has again added a great deal of syntax to DistSQL, bringing more practical features.

In this article, the community co-authors will share the latest functions of DistSQL from the perspective of cluster governance.

ShardingSphere clusters

In a typical cluster composed of ShardingSphere-Proxy, there are multiple compute nodes and storage nodes, as shown in the figure below.

[Figure: a ShardingSphere-Proxy cluster of compute nodes and storage nodes. Image by Jiang Longtao and Lan Chengxiang, CC BY-SA 4.0]

For simplicity, ShardingSphere refers to each proxy as a compute node and to the proxy-managed distributed database resources (such as ds_0 or ds_1) as resources or storage nodes.

Multiple proxy or compute nodes are connected to the same register center. They share configuration and rules, and they can sense each other's online status. These compute nodes also share the underlying storage nodes, so they can perform read and write operations to the storage nodes at the same time. The user application is connected to any compute node and can perform equivalent operations.

Through this cluster architecture, you can quickly scale proxy horizontally when compute resources are insufficient, reducing the risk of a single point of failure and improving system availability. The load-balancing mechanism can also be added between the application and compute node.

Compute node governance

Compute node governance is suitable for cluster mode. For more information about the ShardingSphere modes, please see Your detailed guide to Apache ShardingSphere's operating modes.

Cluster preparation

Take a simulation of three proxy compute nodes on a single machine as an example. To use cluster mode, apply the configuration below:

mode:
  type: Cluster
  repository:
    type: ZooKeeper
    props:
      namespace: governance_ds
      server-lists: localhost:2181
      retryIntervalMilliseconds: 500
      timeToLiveSeconds: 60
      maxRetries: 3
      operationTimeoutMilliseconds: 500
  overwrite: false

Execute the bootup command separately:

sh "$SHARDINGSPHERE_PROXY_HOME"/bin/start.sh 3307
sh "$SHARDINGSPHERE_PROXY_HOME"/bin/start.sh 3308
sh "$SHARDINGSPHERE_PROXY_HOME"/bin/start.sh 3309

After the three proxy instances are successfully started, the compute node cluster is ready.

SHOW INSTANCE LIST

Use the client to connect to any compute node, such as 3307:

mysql -h 127.0.0.1 -P 3307 -u root -p

View the list of instances using SHOW INSTANCE LIST:

mysql> SHOW INSTANCE LIST;
+----------------+-----------+------+---------+
| instance_id    | host      | port | status  |
+----------------+-----------+------+---------+
| 10.7.5.35@3309 | 10.7.5.35 | 3309 | enabled |
| 10.7.5.35@3308 | 10.7.5.35 | 3308 | enabled |
| 10.7.5.35@3307 | 10.7.5.35 | 3307 | enabled |
+----------------+-----------+------+---------+

The above fields mean:

  • instance_id: The id of the instance, which is currently composed of host and port
  • host: Host address
  • port: Port number
  • status: The status of the instance, either enabled or disabled

DISABLE INSTANCE

Use a DISABLE INSTANCE statement to set the specified compute node to a disabled state. The statement does not terminate the process of the target instance but only virtually deactivates it.

DISABLE INSTANCE supports the following syntax forms:

DISABLE INSTANCE 10.7.5.35@3308;
#or
DISABLE INSTANCE IP=10.7.5.35, PORT=3308;

For example:

mysql> DISABLE INSTANCE 10.7.5.35@3308;
Query OK, 0 rows affected (0.02 sec)
mysql> SHOW INSTANCE LIST;
+----------------+-----------+------+----------+
| instance_id    | host      | port | status   |
+----------------+-----------+------+----------+
| 10.7.5.35@3309 | 10.7.5.35 | 3309 | enabled  |
| 10.7.5.35@3308 | 10.7.5.35 | 3308 | disabled |
| 10.7.5.35@3307 | 10.7.5.35 | 3307 | enabled  |
+----------------+-----------+------+----------+

After executing the DISABLE INSTANCE statement, query the instance list again. The status of the instance on port 3308 has been updated to disabled, indicating that the compute node has been disabled.

If there is a client connected to 10.7.5.35@3308, executing any SQL statement will prompt an exception:

1000 - Circuit break mode IS ON.

You are not allowed to disable the compute node you are currently connected to. If you send DISABLE INSTANCE 10.7.5.35@3309 to 10.7.5.35@3309 itself, you will receive an exception prompt.
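The two syntax forms shown earlier address the same compute node, because the instance_id is currently composed of host and port. A minimal Python sketch of that relationship follows; the helper names parse_instance_id and disable_statements are hypothetical and are not part of ShardingSphere.

```python
def parse_instance_id(instance_id: str) -> tuple[str, int]:
    """Split an instance id of the form host@port into (host, port)."""
    host, sep, port = instance_id.rpartition("@")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"not a host@port instance id: {instance_id!r}")
    return host, int(port)


def disable_statements(instance_id: str) -> tuple[str, str]:
    """Return both DISABLE INSTANCE forms for the same compute node."""
    host, port = parse_instance_id(instance_id)
    return (
        f"DISABLE INSTANCE {host}@{port};",
        f"DISABLE INSTANCE IP={host}, PORT={port};",
    )
```

Either returned string can be sent through any MySQL client connected to another compute node in the cluster.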

ENABLE INSTANCE

Use an ENABLE INSTANCE statement to set the specified compute node to an enabled state. ENABLE INSTANCE supports the following syntax forms:

ENABLE INSTANCE 10.7.5.35@3308;
#or
ENABLE INSTANCE IP=10.7.5.35, PORT=3308;

For example:

mysql> SHOW INSTANCE LIST;
+----------------+-----------+------+----------+
| instance_id    | host      | port | status   |
+----------------+-----------+------+----------+
| 10.7.5.35@3309 | 10.7.5.35 | 3309 | enabled  |
| 10.7.5.35@3308 | 10.7.5.35 | 3308 | disabled |
| 10.7.5.35@3307 | 10.7.5.35 | 3307 | enabled  |
+----------------+-----------+------+----------+
mysql> ENABLE INSTANCE 10.7.5.35@3308;
Query OK, 0 rows affected (0.01 sec)
mysql> SHOW INSTANCE LIST;
+----------------+-----------+------+----------+
| instance_id    | host      | port | status   |
+----------------+-----------+------+----------+
| 10.7.5.35@3309 | 10.7.5.35 | 3309 | enabled  |
| 10.7.5.35@3308 | 10.7.5.35 | 3308 | enabled  |
| 10.7.5.35@3307 | 10.7.5.35 | 3307 | enabled  |
+----------------+-----------+------+----------+

After executing the ENABLE INSTANCE statement, query again to see that the status of the instance on port 3308 has been restored to enabled.

How to manage compute node parameters

In our article Integrating SCTL into DISTSQL's RAL: Making Apache ShardingSphere perfect for database management, we explained the evolution of ShardingSphere control language (SCTL) to resource and rule administration language (RAL) and the new SHOW VARIABLE and SET VARIABLE syntax.

However, in 5.0.0-Beta, the VARIABLE category of DistSQL RAL contained only the following three statements:

SET VARIABLE TRANSACTION_TYPE = xx; (LOCAL, XA, BASE)
SHOW VARIABLE TRANSACTION_TYPE;
SHOW VARIABLE CACHED_CONNECTIONS;

By listening to the community's feedback, we noticed that querying and modifying the props configuration of proxy (located in server.yaml) is also a frequent operation. Therefore, we have added support for props configuration in DistSQL RAL since the 5.0.0 GA version.

SHOW VARIABLE

First, we'll review how to configure props:

props:
  max-connections-size-per-query: 1
  kernel-executor-size: 16  # Infinite by default.
  proxy-frontend-flush-threshold: 128  # The default value is 128.
  proxy-opentracing-enabled: false
  proxy-hint-enabled: false
  sql-show: false
  check-table-metadata-enabled: false
  show-process-list-enabled: false
  # Proxy backend query fetch size. A larger value may increase the memory usage of ShardingSphere Proxy.
  # The default value is -1, which means set the minimum value for different JDBC drivers.
  proxy-backend-query-fetch-size: -1
  check-duplicate-table-enabled: false
  proxy-frontend-executor-size: 0 # Proxy frontend executor size. The default value is 0, which means let Netty decide.
  # Available options of proxy backend executor suitable: OLAP(default), OLTP. The OLTP option may reduce time cost of writing packets to client, but it may increase the latency of SQL execution
  # and block other clients if client connections are more than `proxy-frontend-executor-size`, especially executing slow SQL.
  proxy-backend-executor-suitable: OLAP
  proxy-frontend-max-connections: 0 # Less than or equal to 0 means no limitation.
  sql-federation-enabled: false
  # Available proxy backend driver type: JDBC (default), ExperimentalVertx
  proxy-backend-driver-type: JDBC

Now, you can perform interactive queries by using the following syntax:

SHOW VARIABLE PROXY_PROPERTY_NAME;

For example:

mysql> SHOW VARIABLE MAX_CONNECTIONS_SIZE_PER_QUERY;
+--------------------------------+
| max_connections_size_per_query |
+--------------------------------+
| 1                              |
+--------------------------------+
1 row in set (0.00 sec)
mysql> SHOW VARIABLE SQL_SHOW;
+----------+
| sql_show |
+----------+
| FALSE    |
+----------+
1 row in set (0.00 sec)
……

Note: For DistSQL syntax, parameter keys are separated by underscores.
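The mapping between a props key in the YAML file and the name used in DistSQL can be sketched as below. This is an illustration only: the function names are hypothetical, and the uppercase spelling simply mirrors the examples above (the article does not state whether case matters).

```python
def to_distsql_variable(props_key: str) -> str:
    """Turn a hyphen-separated props key into the underscore-separated DistSQL name."""
    return props_key.strip().replace("-", "_").upper()


def show_variable_statement(props_key: str) -> str:
    """Build the SHOW VARIABLE statement for a props key."""
    return f"SHOW VARIABLE {to_distsql_variable(props_key)};"
```

For instance, passing the key max-connections-size-per-query yields the first query shown above.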

SHOW ALL VARIABLES

Since there are plenty of parameters in proxy, you can also query all parameter values through SHOW ALL VARIABLES:

mysql> SHOW ALL VARIABLES;
+---------------------------------------+----------------+
| variable_name                         | variable_value |
+---------------------------------------+----------------+
| sql_show                              | FALSE          |
| sql_simple                            | FALSE          |
| kernel_executor_size                  | 0              |
| max_connections_size_per_query        | 1              |
| check_table_metadata_enabled          | FALSE          |
| proxy_frontend_database_protocol_type |                |
| proxy_frontend_flush_threshold        | 128            |
| proxy_opentracing_enabled             | FALSE          |
| proxy_hint_enabled                    | FALSE          |
| show_process_list_enabled             | FALSE          |
| lock_wait_timeout_milliseconds        | 50000          |
| proxy_backend_query_fetch_size        | -1             |
| check_duplicate_table_enabled         | FALSE          |
| proxy_frontend_executor_size          | 0              |
| proxy_backend_executor_suitable       | OLAP           |
| proxy_frontend_max_connections        | 0              |
| sql_federation_enabled                | FALSE          |
| proxy_backend_driver_type             | JDBC           |
| agent_plugins_enabled                 | FALSE          |
| cached_connections                    | 0              |
| transaction_type                      | LOCAL          |
+---------------------------------------+----------------+
21 rows in set (0.01 sec)

SET VARIABLE

Dynamic management of resources and rules is a special advantage of DistSQL. Now you can also dynamically update props parameters by using the SET VARIABLE statement. For example:

#Enable SQL log output
SET VARIABLE SQL_SHOW = true;
#Turn on hint function
SET VARIABLE PROXY_HINT_ENABLED = true;
#Enable federated query
SET VARIABLE SQL_FEDERATION_ENABLED = true;
……

The SET VARIABLE statement can modify the following parameters, but the new values take effect only after the proxy restarts:

  • kernel_executor_size
  • proxy_frontend_executor_size
  • proxy_backend_driver_type

The following parameters are read-only and cannot be modified:

  • cached_connections

Other parameters will take effect immediately after modification.
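The three categories above can be captured in a small lookup, for example to warn an operator before a change is sent. This is a sketch based only on the lists in this article; the function name and the returned labels are hypothetical, and the sets would need to be checked against the ShardingSphere version in use.

```python
# Parameter names copied from the lists in this article.
RESTART_REQUIRED = frozenset({
    "kernel_executor_size",
    "proxy_frontend_executor_size",
    "proxy_backend_driver_type",
})
READ_ONLY = frozenset({"cached_connections"})


def set_variable_effect(name: str) -> str:
    """Classify when a SET VARIABLE change to the given parameter takes effect."""
    key = name.strip().lower()
    if key in READ_ONLY:
        return "read-only"
    if key in RESTART_REQUIRED:
        return "after restart"
    return "immediate"
```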

How to manage storage nodes

In ShardingSphere, storage nodes are not directly bound to compute nodes. One storage node may play different roles in different schemas at the same time, in order to implement different business logic. Storage nodes are always associated with a schema.

For DistSQL, storage nodes are managed through RESOURCE-related statements, including:

  • ADD RESOURCE
  • ALTER RESOURCE
  • DROP RESOURCE
  • SHOW SCHEMA RESOURCES

Schema preparation

RESOURCE-related statements only work on schemas, so before operating, you need to create a schema and select it with the USE command:

DROP DATABASE IF EXISTS sharding_db;
CREATE DATABASE sharding_db;
USE sharding_db;

ADD RESOURCE

ADD RESOURCE supports the following syntax forms:

  • Specify HOST, PORT, DB

ADD RESOURCE resource_0 (
HOST=127.0.0.1,
PORT=3306,
DB=db0,
USER=root,
PASSWORD=root
);

  • Specify URL

ADD RESOURCE resource_1 (
URL="jdbc:mysql://127.0.0.1:3306/db1?serverTimezone=UTC&useSSL=false",
USER=root,
PASSWORD=root
);

The above two syntax forms support the extension parameter PROPERTIES, which is used to specify the attribute configuration of the connection pool between the proxy and the storage node.

For example:

ADD RESOURCE resource_2 (
HOST=127.0.0.1,
PORT=3306,
DB=db2,
USER=root,
PASSWORD=root,
PROPERTIES("maximumPoolSize"=10)
),resource_3 (
URL="jdbc:mysql://127.0.0.1:3306/db3?serverTimezone=UTC&useSSL=false",
USER=root,
PASSWORD=root,
PROPERTIES("maximumPoolSize"=10,"idleTimeout"="30000")
);

Specifying Java Database Connectivity (JDBC) connection parameters, such as useSSL, is supported only in the URL form.
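When many storage nodes have to be registered, the URL form can be generated from a script. The sketch below assembles a statement in the layout used in the examples above; add_resource_by_url is a hypothetical helper, it performs no escaping or validation, and it assumes (as in the examples) that integer property values are written bare and string values are quoted.

```python
def _render(value) -> str:
    """Write integers bare and everything else in double quotes."""
    return str(value) if isinstance(value, int) else f'"{value}"'


def add_resource_by_url(name, url, user, password, pool_props=None) -> str:
    """Build an ADD RESOURCE statement in the URL form."""
    lines = [f'URL="{url}"', f"USER={user}", f"PASSWORD={password}"]
    if pool_props:
        pairs = ",".join(f'"{key}"={_render(val)}' for key, val in pool_props.items())
        lines.append(f"PROPERTIES({pairs})")
    return f"ADD RESOURCE {name} (\n" + ",\n".join(lines) + "\n);"
```

Calling it with the values of resource_3 from the example above reproduces that part of the statement.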

ALTER RESOURCE

Use ALTER RESOURCE to modify the connection information of storage nodes, such as changing the size of a connection pool or modifying JDBC connection parameters.

Syntactically, ALTER RESOURCE is identical to ADD RESOURCE.

ALTER RESOURCE resource_2 (
HOST=127.0.0.1,
PORT=3306,
DB=db2,
USER=root,
PROPERTIES("maximumPoolSize"=50)
),resource_3 (
URL="jdbc:mysql://127.0.0.1:3306/db3?serverTimezone=GMT&useSSL=false",
USER=root,
PASSWORD=root,
PROPERTIES("maximumPoolSize"=50,"idleTimeout"="30000")
);

Since modifying the storage node may cause metadata changes or application data exceptions, ALTER RESOURCE cannot be used to modify the target database of the connection. Only the following values can be modified:

  • User name
  • User password
  • PROPERTIES connection pool parameters
  • JDBC parameters

DROP RESOURCE

Use DROP RESOURCE to delete storage nodes from a schema without deleting any data in the storage node. The statement example is as follows:

DROP RESOURCE resource_0, resource_1;

To ensure data correctness, a storage node that is referenced by a rule cannot be deleted.

For example, suppose t_order is a sharding table and its actual tables are distributed across resource_0 and resource_1. As long as resource_0 and resource_1 are referenced by the t_order sharding rule, they cannot be deleted.
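The constraint can be modeled in a few lines of Python. This is only a toy model of the rule described here, not how ShardingSphere implements the check; the function name and the shape of rule_references (rule name mapped to the resources it uses) are assumptions made for illustration.

```python
def droppable(resources, rule_references):
    """Return the resources that no rule references, in their original order."""
    referenced = set()
    for used in rule_references.values():
        referenced.update(used)
    return [name for name in resources if name not in referenced]
```

With the t_order example, a call such as droppable(["resource_0", "resource_1", "resource_2"], {"t_order": ["resource_0", "resource_1"]}) leaves only resource_2 as a candidate for DROP RESOURCE.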

SHOW SCHEMA RESOURCES

SHOW SCHEMA RESOURCES is used to query storage nodes in schemas and supports the following syntax forms:

#Query the storage node in the current schema
SHOW SCHEMA RESOURCES;
#Query the storage node in the specified schema
SHOW SCHEMA RESOURCES FROM sharding_db;

For example, add four storage nodes through the ADD RESOURCE command, and then execute a query:

[Figure: result of the SHOW SCHEMA RESOURCES query. Image by Jiang Longtao and Lan Chengxiang, CC BY-SA 4.0]

There are many columns in the query result, but here we only show part of them.

Conclusion

In this article, we have introduced you to the ways you can dynamically manage storage nodes through DistSQL.

Unlike modifying YAML files, executing DistSQL statements happens in real time, and there is no need to restart the proxy or compute node, making online operations safer. Changes executed through DistSQL can be synchronized to other compute nodes in the cluster in real time through the register center. The client connected to any compute node can also query changes of storage nodes in real time.

If you have any questions or suggestions about Apache ShardingSphere, please open an issue on the GitHub issue list. If you are interested in contributing to the project, you're very welcome to join the Apache ShardingSphere community.

This article originally appeared on FAUN and is republished with permission.

A feature update to Apache ShardingSphere enhances the dynamic management of storage nodes.

Image by: Jason Baker, CC BY-SA 4.0.

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

You asked. We acted: Red Hat Customer Portal launches improved technical documentation user experience

Red Hat News - Wed, 08/24/2022 - 12:00

An improved user experience for all technical documentation on the Customer Portal was launched in May of 2022. The redesign rolled out several new features, including an all-new reading mode, expanding tables and an overhaul of the navigation and layout. The best part about the redesign? It was driven by customer feedback.

Google Posts Updated Encrypted Hibernation Patches For Linux

Phoronix - Wed, 08/24/2022 - 07:27
Back in May there was a patch series by Google engineers working on encrypted hibernation support for Linux that would be protected by the platform hardware itself like with a TPM module as well as user authentication by a password or other means. Sent out today is a second revision to that Linux encrypted hibernation support...

AMD's New PMF CPU Linux Driver Now Preparing For "CnQF"

Phoronix - Wed, 08/24/2022 - 03:30
As I've written about the past several weeks, AMD engineers have been preparing a Platform Management Framework (PMF) driver for Linux. The AMD Platform Management Framework for future hardware appears similar to Intel's Dynamic Platform and Thermal Framework (DPTF) and designed to enhance the thermal/power performance of future platforms...

Wine-Based CrossOver 22 Released For Enjoying Windows Apps & Games On Linux

Phoronix - Wed, 08/24/2022 - 01:42
CodeWeavers today announced the availability of their Wine-based CrossOver 22 software for enjoying Windows applications and games atop Linux, ChromeOS, and macOS...

Elevate Your Organization’s Open Source Strategy

The Linux Foundation - Wed, 08/24/2022 - 00:58

The role of software, specifically open source software, is more influential than ever and drives today’s innovation. Maintaining and growing future innovation depends on the open source community. Enterprises that understand this are driving transformation and rising to the challenges by boosting their collaboration across industries, understanding how to support their open source developers, and contributing to the open source community.

They realize that success depends on a cohesive, dedicated, and passionate open source community, from hundreds to thousands of individuals. Their collaboration is key to achieving the project’s goals. It can be challenging to manage all aspects of an open source project considering all the different parts that drive it. For example:

  • Project’s scope and goals
  • Participating members, maintainers, and collaborators
  • Management and governance
  • Legal guidelines and procedures
  • IT services 
  • Source control, CI/CD, distribution, and cloud providers
  • Communication channels and social media

The Linux Foundation’s LFX provides various tools to help open source communities design and adopt a successful project strategy considering all moving parts. So how do they do it? Let’s explore that using the Hyperledger project as an example. 

1. Understand your project’s participation

Through the LFX Individual Dashboard, participants can register the identity they are using to contribute their code to GitHub and Gerrit (since the Hyperledger project uses both). Then, the tool uses that identity to connect users’ contributions, affiliations, memberships, training, certifications, earned badges, and general information.

With this information, other LFX tools gather and propagate data charts to help the community visualize their participation in GitHub and Gerrit for the different Hyperledger repositories. It also displays detailed contribution metrics, code participation, and issue participation.  

The LFX Organization Dashboard is a convenient tool to help managers and organizations manage their project memberships, discover similar projects to join, and understand the team’s engagement in the community. In detail, it provides information on:

  • Code contributions
  • Committee members
  • Event speakers and attendees 
  • Training and certification
  • Project enrollments

It is vital to have the project’s members and participant identities organized to understand better how their work makes a difference in the project and how their participation interacts with others toward the project’s goals.  

2. Manage your project’s processes

LFX Project Control Center offers a one-stop portal for program managers to organize their project participation, IT services, and quick access to other LFX tools.

Project managers can also connect:

  • Their project’s source control
  • Issue tracking tool
  • Distribution service
  • Cloud provider
  • Mail lists
  • Meeting management
  • Wiki and hosted domains 

For example, Hyperledger can view all related organizations under their Hyperledger Foundation umbrella, analyze each participant project, and connect services like GitHub, Jira, Confluence, and their communication channels like Groups.io and Twitter accounts.

Managing all the project’s aspects in one place makes it easier for managers to visualize their project scope and better understand how all their services impact the project’s performance.

3. Reach outside and get your project in the spotlight

Social and earned media are vital to ensure your project reaches the ears of its consumers. In addition, it is essential to have good visibility into your project’s influence in the Open Source world and where it is making the best impact.

LFX’s Insights Social Media Metrics provides high-level metrics on a project’s social media account like:

  • Twitter followers and following information 
  • Tweets and retweet breakdown
  • Trending tweets
  • Hashtag breakdown 
  • Contributor and user mentions

In the case of Hyperledger, we have an overall view of their tweet and retweet breakdown. In addition, we can also see how tweets by Bitcoin News are making an impression on the interested communities. 

Insights helps you analyze how your project impacts other regions and reaches diverse audiences by language, so that you can adjust communication and marketing strategies to reach the sources that open source participants rely on for the latest information on how the community contributes and engages with others. For example, tweets written in English, Japanese, and Spanish made by Hyperledger contributors are visible in an overall languages chart with direct and indirect impressions calculated.

The bottom line

A coherent open source project strategy is a crucial driver of how enterprises manage their open source programs across their organization and industry. LFX is one of the tools that make enterprise open source programs successful. It is an exclusive benefit for Linux Foundation members and projects. If your organization and project would like to join us, learn more about membership or hosting your project.

The post Elevate Your Organization’s Open Source Strategy appeared first on Linux Foundation.
