Make cloud data analysis more valuable

Working closely with Intel, Kingsoft Cloud has optimized its data center infrastructure and created differentiated cloud analysis services. With KMR (Kingsoft MapReduce), it has won many new customers in the medical services, e-commerce and gaming industries. Despite fierce competition, the new service also helps Kingsoft Cloud retain existing customers by adding powerful analytics to their existing data platforms.


In order to win and retain customers, to grow and thrive, cloud service providers must ensure continuous innovation, help customers gain new insights from data, and find ways to create additional value that helps customers drive business growth.


As a leading cloud service provider in China, Kingsoft Cloud provides cloud storage services for a variety of data, such as pictures, videos and sales logs. Customers can use Kingsoft Cloud products such as cloud servers, large-scale cloud storage, load balancing and cloud relational databases to dynamically allocate resources to meet their business requirements. Kingsoft Cloud's cloud storage service ensures the continuity of customer business and lowers the total cost of ownership (TCO).




Kingsoft Cloud's data storage capacity has now reached the exabyte (EB) level, and more than 1 PB of new data is added in a single day. Kingsoft Cloud continues to innovate and has cooperated with Intel to explore, develop and launch a big data analysis and deep learning service that enhances the existing Kingsoft Cloud storage platform. With this differentiated service, it has gained a competitive advantage in the cloud service market.


Yu Ni, Senior Big Data Architect at Kingsoft Cloud, said that working with Intel has allowed the company to draw on Intel's experience with other global cloud service providers. He believes this has accelerated learning and allowed Kingsoft Cloud to quickly launch new services and win new customers.


Adding data analysis and deep learning to cloud storage services

Kingsoft Cloud and Intel established a partnership of in-depth cooperation and innovation several years ago. The two companies jointly identified a new market opportunity: using Intel Xeon processors to enhance Kingsoft Cloud's object storage service (Kingsoft Standard Storage Service, KS3 for short) with new analytics. With this enhanced service, customers get practical insights from the data stored in KS3. More importantly, the new service is low in cost, and the low cost does not come at the expense of performance.


Kingsoft Cloud and Intel worked together to develop and launch Kingsoft Cloud MapReduce (KMR), a big data analytics platform that helps customers quickly build data analytics clusters and process massive amounts of data. KMR is based on Apache Hadoop, and it can also analyze real-time data through Apache Storm and Apache Kafka. KS3 is an object storage service, and the data in it can be queried directly from KMR. As a result, customers can run Apache Hadoop at a lower cost than with the block storage commonly used in big data analytics.


The combination of KMR and KS3 allows customers to analyze their existing data in the cloud. KMR acts as an enhancement to KS3, letting customers get more value from their existing data and platforms.


KMR is a hosted cluster service built on hundreds of server nodes, each with dual Intel Xeon processors. Each server is equipped with Intel data center-class SSDs to deliver high data throughput to the processors. Each server also uses an Intel 10 Gigabit Ethernet Converged Network Adapter, so cluster nodes can share data with each other while retaining the flexibility and scalability needed in demanding data center and cloud environments. In addition, KMR optimizes parallel workload performance with Intel Transactional Synchronization Extensions (Intel TSX).


KMR is based on computing frameworks such as Apache Hadoop and Spark. Customers can use the KMR web-based management console to create clusters and to configure the CPU cores, memory, and disk space of the virtual machines. They can also choose the number of nodes in the cluster, or add new nodes in the console to expand the cluster gradually. KMR integrates cluster monitoring and management tools such as Ambari and Ganglia, which can be configured and used through the same console. In addition, KMR integrates with cloud services such as KS3, Table Database Service and Kingsoft Relational Database Service (KRDS) to provide end-to-end big data solutions. Data can also be stored in the cluster's local Hadoop Distributed File System (HDFS).


This end-to-end integration of cloud services has helped Kingsoft Cloud stand out in the industry. For example, KMR includes an access interface to KS3, so MapReduce and Apache Spark jobs running in a KMR cluster can read the data to be processed directly from KS3 and then write the results back. With this integration, customers can use KS3 to ensure that raw data and calculation results remain permanently stored in the cloud after the cluster is released. KS3 is relatively cheap, and the data stored in it is highly reliable. Kingsoft Cloud bills Apache Hadoop clusters by usage time, so customers can save their results to KS3 (object cold storage) and release the Apache Hadoop cluster at any time, which helps them save costs.
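The cost argument can be made concrete with a rough calculation. The sketch below is illustrative only: the node price, storage price and workload figures are hypothetical placeholders rather than Kingsoft Cloud pricing, and the function names are invented for this example.

```python
def always_on_cost(nodes, price_per_node_hour, hours_in_period):
    """Cost of keeping a cluster running for the whole billing period."""
    return nodes * price_per_node_hour * hours_in_period


def on_demand_cost(nodes, price_per_node_hour, job_hours,
                   stored_gb, price_per_gb_period):
    """Cost of running the cluster only while jobs execute and then
    releasing it, with raw data and results kept in object storage."""
    compute = nodes * price_per_node_hour * job_hours
    storage = stored_gb * price_per_gb_period
    return compute + storage


# Hypothetical figures: 10 nodes at 2.0 currency units per node-hour,
# a 720-hour month, 60 hours of actual job time, and 5000 GB kept in
# object storage at 0.125 units per GB per month.
always_on = always_on_cost(10, 2.0, 720)
on_demand = on_demand_cost(10, 2.0, 60, 5000, 0.125)
savings = always_on - on_demand
```

Under these assumed numbers, compute time dominates the bill, so releasing the cluster between jobs accounts for nearly all of the difference. Real savings depend on actual prices and on how many hours the jobs run.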


How to help customers achieve cloud analysis

Through joint development and collaboration with Intel, Kingsoft Cloud has successfully transformed its cloud storage services to also provide cloud analysis services. This new service has helped Kingsoft Cloud attract many new customers and gain a differentiated advantage in a competitive market.


Kingsoft Cloud recently won a new customer in the medical field with KMR. The hospital has hundreds of terabytes of data in total, and its Clinical Data Repository (CDR) contains nearly one billion patient diagnostic records. Since deploying KMR, users can find the information they need within a few milliseconds and quickly obtain analysis results. Another new customer runs an e-commerce website whose product demand is strongly affected by promotional activities and is therefore difficult to control. With KMR, the customer can quickly create Storm and Kafka clusters, build real-time data processing systems, and write the processing results to MongoDB databases. The customer now has the capability to analyze its data, and building that capability in the cloud is fast and scalable. The total cost of ownership is much lower than if the enterprise built its own system, and the time to launch is much shorter.
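The kind of real-time processing described for the e-commerce customer often comes down to counting events in short time windows. The following is a minimal pure-Python sketch of a tumbling-window count, not the customer's actual topology: in a real deployment the events would arrive through Kafka and be processed by a Storm topology, and the event format and function name here are assumptions made for illustration.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Count orders per product in fixed, non-overlapping time windows.

    events: iterable of (timestamp_seconds, product_id), assumed to be
    in time order. Yields (window_start, Counter) when a window closes.
    Windows that contain no events are not emitted.
    """
    current_start = None
    counts = Counter()
    for timestamp, product_id in events:
        window_start = timestamp - (timestamp % window_seconds)
        if current_start is None:
            current_start = window_start
        elif window_start != current_start:
            # The previous window is complete; emit it and start a new one.
            yield current_start, counts
            current_start, counts = window_start, Counter()
        counts[product_id] += 1
    if current_start is not None:
        yield current_start, counts


# Hypothetical order events: (seconds since start, product id).
orders = [(0, "A"), (12, "B"), (45, "A"), (61, "A"), (75, "A"), (130, "B")]
windows = [(start, dict(c)) for start, c in tumbling_window_counts(orders, 60)]
```

Each emitted window could then be written to a results store such as MongoDB, so that a spike in orders during a promotion becomes visible within one window length.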


An online game company has hundreds of gigabytes of log files in total. In the past, these logs were stored in a MySQL database and had to be queried regularly to generate reports and key performance indicators (KPIs). Storing data this way was expensive, system availability was not guaranteed, and integrating different data sets had to be done manually. Now the company uses KMR to store and process log data in a unified manner, which lets it quickly build a cluster environment and scale it with the number of players. The data is stored in KS3 and queried through KMR, which saves manpower, improves efficiency and greatly reduces storage costs.
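Generating a KPI from game logs fits the MapReduce pattern that KMR is built around. The sketch below imitates the map, shuffle and reduce phases in plain Python to compute daily active players. It does not use any KMR or Hadoop API, and the log format, sample data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical log format: "date,player_id,event"
SAMPLE_LOGS = [
    "2018-06-01,p1,login",
    "2018-06-01,p2,login",
    "2018-06-01,p1,purchase",
    "2018-06-02,p1,login",
    "2018-06-02,p3,login",
    "2018-06-02,p3,logout",
]


def map_phase(lines):
    """Emit one (date, player_id) pair per log line."""
    for line in lines:
        date, player_id, _event = line.split(",")
        yield date, player_id


def shuffle_phase(pairs):
    """Group values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Count distinct players for each date."""
    return {date: len(set(players)) for date, players in grouped.items()}


daily_active_users = reduce_phase(shuffle_phase(map_phase(SAMPLE_LOGS)))
```

On a cluster, the map and reduce steps run in parallel across nodes, which is what allows the same logic to scale as the number of players grows.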


Customers clearly choose cloud-based big data processing because it is flexible, efficient, and easy to deploy and expand. As these cases show, Kingsoft Cloud's customers prefer flexible cloud services.


By working with Intel, cloud service providers can gain engineering expertise in Apache Hadoop and Spark, as well as in other technologies widely used in cloud computing. Upgrading to a new generation of Intel Xeon processors can significantly increase throughput and the number of virtual machines per server. This reduces total cost of ownership and helps cloud service providers get more value out of their data center assets. Kingsoft Cloud is one of the earliest service providers to do this.


Benefiting from long-term cooperation between engineers

Since Kingsoft Cloud was founded in 2012, Intel has been working with the company to develop detailed specifications for solutions that benchmark and optimize devices and platforms across computing, storage and networking. The two parties have jointly launched a Software Defined Infrastructure (SDI) to optimize and modernize the infrastructure of the Kingsoft Cloud data center.


Kingsoft Cloud has standardized on Intel architecture. In the second half of 2018, its processors will be upgraded from the Intel Xeon processor E5-2690 v4 to the Intel Xeon Scalable processor 6132, further enhancing the computing performance and scalability of Kingsoft Cloud KMR. From the beginning, Intel engineers have run benchmarks on the actual infrastructure in parallel in Kingsoft Cloud's data center and in Intel's labs.


Together, Intel engineers and Kingsoft Cloud optimized the company's network with the open source Data Plane Development Kit (DPDK), improving the network processing performance of network functions such as virtual switches and firewalls.


Meanwhile, Intel used the Intel Data Analytics Acceleration Library (Intel DAAL) to help Kingsoft Cloud improve the performance of deep learning workloads, and leveraged BigDL, a distributed deep learning library for Apache Spark, to speed up processing.


Intel will continue to work with Kingsoft Cloud to explore further optimizations and discover new service opportunities, in order to help Kingsoft Cloud further differentiate its cloud business. Kingsoft Cloud is currently deploying Intel Xeon Scalable processors and is evaluating other Intel technologies to optimize the performance of machine learning workloads.


Kingsoft Cloud also collaborated with Intel to develop and launch the Kingsoft Deep Learning Platform (KDL), which helps customers gain deeper insights from their data sets. The platform provides artificial intelligence as a service (AIaaS), letting customers use application programming interfaces (APIs) for deep learning training and inference. It can be used in scenarios such as image analysis, image recognition, video recognition and speech recognition. KDL is container-based and supports the TensorFlow, Caffe, and MXNet deep learning frameworks. Customers do not need to build a deep learning environment themselves, so they can focus on building deep learning models and running training and inference workloads. All data can be stored in KS3. Intel helped Kingsoft Cloud optimize deep learning performance on Intel Xeon processors, and Kingsoft Cloud is able to use the Intel Xeon Scalable platform to significantly enhance the performance of KDL. On the Intel Xeon Scalable processor 6132, replacing the standard version of Caffe with the Intel-optimized Caffe and using 8 cores for online ResNet50 inference delivers a performance improvement of more than 40 times.