Amazon SQS是AWS上支流的音讯队列服务,按理说它是有SDK的,那么为什么还要本人编写客户端呢?因为它提供的SDK太简略,就几个Web API,没有方法间接用。咱们具体来说一说。
SQS SDK中的API,咱们次要用到的也就是getQueueUrl, sendMessage, receiveMessage等。getQueueUrl能依据传入的queueName查找到queueUrl,后续用这个queueUrl来拜访相应的queue(即:调用sendMessage发消息,或调用receiveMessage收音讯)。次要复杂度在于收音讯:这个API是要被动调用的,可是你怎么晓得有没有新音讯须要你去收呢?事实上,这个receiveMessage API是基于拉模式(pull mode)的,你须要轮询来不停地拉取新音讯,这个比拟像Kafka。随之而来的,就须要线程治理,须要一个对SDK做了进一步包装的客户端库。
Spring Cloud Messaging提供了SQS的客户端库。然而当咱们在2023年3月构建基于SQS的应用程序时,咱们用的是AWS SDK V2,而Spring Cloud Messaging尚未正式反对AWS SDK V2。因而,咱们决定本人编写SQS的客户端库。而且咱们的设计也与Spring Cloud Messaging的有所不同:咱们同时应用多个AWS账号,为此,咱们间接在配置中援用queueUrl(它其实是动态值,可间接援用);而Spring Cloud Messaging只能在配置中援用queueName,而后再运行时获取以后AWS账号中相应的queueUrl。
当初就来讲一讲设计与实现。音讯队列客户端遵循生产者-消费者模型,分为Producer和Consumer。SQS的音讯体必须是不大于256KB的文本,因而能够把音讯体当成一个String。
Producer
Producer很简略,把音讯收回去就行了,顺便对超时和异样做适当的解决。库的用户能够自行决定音讯体的序列化和反序列化形式,咱们不干预这件事。
Producer的应用形式很简略:
new SqsMessageProducer(queueUrl) .produce(yourMessagePayload);
Producer的残缺实现代码大抵如下:
/** How to use: Call produce() with your serialized message string. */public class SqsMessageProducer { private final String queueUrl; private final int timeoutSeconds; private final SqsAsyncClient client; public SqsMessageProducer(String queueUrl, int timeoutSeconds) { this.queueUrl = queueUrl; this.timeoutSeconds = timeoutSeconds; client = new SqsClientFactory().createSqsAsyncClient(); } public void produce(String payload) { var sendMessageFuture = client.sendMessage( SendMessageRequest.builder().queueUrl(queueUrl).messageBody(payload).build()); // 不能有限期待future,要有超时机制 try { sendMessageFuture.get(timeoutSeconds, TimeUnit.SECONDS); } catch (InterruptedException | ExecutionException | TimeoutException e) { throw new ProducerException(e); } } public static class ProducerException extends RuntimeException { public ProducerException(Throwable cause) { super(cause); } }}
如果想进一步提高Producer的性能,能够让它异步获取sendMessageFuture的后果,不必同步期待。然而这么做会升高可靠性,不能保障调用了Producer就肯定胜利发送了音讯,因而须要衡量。
Consumer
Consumer的应用形式很简略,无效利用了函数式编程格调,不须要编写派生类,只须要创立Consumer的实例,传入一个音讯处理函数,而后启动就能够。示例代码如下:
new SqsMessageConsumer(queueUrl, yourCustomizedThreadNamePrefix, yourMessageHandler) .runAsync();
Consumer的实现要简单一些,须要实现音讯驱动的异步计算格调。解决音讯个别会比收取音讯更花工夫,因而它创立一个主循环线程用来轮询音讯队列,创立一个工作线程池用来解决音讯。主循环线程每次可能收到0~n个音讯,把收到的音讯分发给工作线程池来解决。因为工作线程池自带工作队列用于缓冲,所以这两种线程之间是互不阻塞的:如果工作线程慢了,主循环线程能够照常收取和散发新音讯;如果主循环线程慢了,工作线程能够照常解决已有的音讯。
留神一个要点:SQS不会主动清理已被收取的音讯,因为它不晓得你是否胜利解决了音讯。当一个音讯被收取后,它会临时被暗藏,免得其余消费者收到它,如果此音讯始终没有被清理,它会在一段时间后(默认30秒,可配置)从新呈现,被某个消费者再度收取。你须要一个机制来被动告知SQS某条音讯已被解决,这个机制就是deleteMessage API:胜利解决一个音讯后,被动调deleteMessage来从队列中删除此音讯;如果解决失败,什么都不必做,SQS会在一段时间后再次让消费者收取到此音讯。
外围代码这么写:
private volatile boolean shouldShutdown = false;// 只有没有敞开,主循环就始终收取音讯while (!shouldShutdown) { List<Message> messages; try { messages = receiveMessages(); } catch (Throwable e) { logger.error("failed to receive", e); continue; } try { dispatchMessages(queueUrl, messages); } catch (Throwable e) { logger.error("failed to dispatch", e); }}// 收音讯的具体实现private List<Message> receiveMessages() throws ExecutionException, InterruptedException { // visibilityTimeout = message handling timeout // It is usually set at infrastructure level var receiveMessageFuture = client.receiveMessage( ReceiveMessageRequest.builder() .queueUrl(queueUrl) .waitTimeSeconds(10) .maxNumberOfMessages(maxParallelism) .build()); // 下面已在申请中设置waitTimeSeconds=10,所以这里能够不设置超时 return receiveMessageFuture.get().messages();}// 把收到音讯分发给工作线程池做解决// 要显式地把解决好的音讯从队列中删除// 如果不删除,会在将来再次被主循环收取到private void dispatchMessages(String queueUrl, List<Message> messages) { for (Message message : messages) { workerThreadPool.execute( () -> { String messageId = message.messageId(); try { logger.info("Started handling message with id={}", messageId); messageHandler.accept(message); logger.info("Completed handling message with id={}", messageId); // Should delete the succeeded message client.deleteMessage( DeleteMessageRequest.builder() .queueUrl(queueUrl) .receiptHandle(message.receiptHandle()) .build()); logger.info("Deleted handled message with id={}", messageId); } catch (Throwable e) { // Logging is enough. Failed message is not deleted, and will be retried on a future polling. logger.error("Failed to handle message with id=$messageId", e); } }); }}
在以上代码中,每次receiveMessage时设置waitTimeSeconds=10,即最多期待10秒,若没有新音讯就返回0条音讯;若有新音讯,就提前返回所收到的1或多条音讯。之所以不有限期待,是怕网关主动敞开长时间静默的网络连接。
还须要一个优雅敞开机制,让服务器能顺利敞开和清理资源:
Thread mainLoopThread = Thread.currentThread();// JVM awaits all shutdown hooks to complete// https://stackoverflow.com/questions/8663107/how-does-the-jvm-terminate-daemon-threads-or-how-to-write-daemon-threads-that-tRuntime.getRuntime() .addShutdownHook( new Thread( () -> { shouldShutdown = true; mainLoopThread.interrupt(); try { workerThreadPool.shutdown(); boolean terminated = workerThreadPool.awaitTermination(1, TimeUnit.MINUTES); if (!terminated) { List<Runnable> runnables = workerThreadPool.shutdownNow(); logger.info("shutdownNow with {} runnables undone", runnables.size()); } } catch (RuntimeException e) { logger.error("shutdown failed", e); throw e; } catch (InterruptedException e) { logger.error("shutdown interrupted", e); throw new IllegalStateException(e); } }));
有时网络连接不稳固,主循环频繁报错比拟noisy,改成指数退却的重试:
while (!shouldShutdown) { List<Message> messages; try { messages = receiveMessages(); // after success, restore backoff to the initial value receiveBackoffSeconds = 1; } catch (Throwable e) { logger.error("failed to receive", e); logger.info("Gonna sleep {} seconds for backoff", receiveBackoffSeconds); try { //noinspection BusyWait Thread.sleep(receiveBackoffSeconds * 1000L); } catch (InterruptedException ex) { logger.error("backoff sleep interrupted", ex); } // after failure, increment next backoff (≤ limit) receiveBackoffSeconds = exponentialBackoff(receiveBackoffSeconds, 60); continue; } try { dispatchMessages(queueUrl, messages); } catch (Throwable e) { logger.error("failed to dispatch", e); }}private int exponentialBackoff(int current, int limit) { int next = current * 2; return Math.min(next, limit);}
工作线程池是一个ThreadPoolExecutor,应用一个有界的BlockingQueue来实现回压(back-pressure),当这个queue一满,主循环线程就会被迫暂停,以避免本地的音讯积压过多:如果积压过多,既会节约内存,又会导致很多音讯被收取却得不到及时处理,这时还不如让给其余消费者实例去收取。创立工作线程池的相干代码如下:
workerThreadPool = new ThreadPoolExecutor( maxParallelism, maxParallelism, 0, TimeUnit.SECONDS, // bounded queue for back pressure new LinkedBlockingQueue<>(100), new CustomizableThreadFactory(threadPoolPrefix + "-pool-"), new TimeoutBlockingPolicy(30));// Used by workerThreadPoolprivate static class TimeoutBlockingPolicy implements RejectedExecutionHandler { private final long timeoutSeconds; public TimeoutBlockingPolicy(long timeoutSeconds) { this.timeoutSeconds = timeoutSeconds; } @Override public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) { try { BlockingQueue<Runnable> queue = executor.getQueue(); if (!queue.offer(r, this.timeoutSeconds, TimeUnit.SECONDS)) { throw new RejectedExecutionException("Timeout after " + timeoutSeconds + " seconds"); } } catch (InterruptedException e) { throw new IllegalStateException(e); } }}
Consumer的残缺实现代码大抵如下:
/** * How to use: * 1. create a consumer instance with a queue name and a stateless messageHandler function. * 2. call runAsync() method to start listening to the queue. */public class SqsMessageConsumer implements Runnable { private static final Logger logger = LoggerFactory.getLogger(SqsMessageConsumer.class); private final String queueUrl; private final Consumer<Message> messageHandler; private final int maxParallelism; private final SqsAsyncClient client; private final ExecutorService workerThreadPool; private volatile boolean shouldShutdown = false; public SqsMessageConsumer( String queueUrl, String threadPoolPrefix, Consumer<Message> messageHandler) { this(queueUrl, threadPoolPrefix, messageHandler, 8); } public SqsMessageConsumer( String queueUrl, String threadPoolPrefix, Consumer<Message> messageHandler, int maxParallelism) { this.queueUrl = queueUrl; this.messageHandler = messageHandler; this.maxParallelism = maxParallelism; client = new SqsClientFactory().createSqsAsyncClient(); workerThreadPool = new ThreadPoolExecutor( maxParallelism, maxParallelism, 0, TimeUnit.SECONDS, // bounded queue for back pressure new LinkedBlockingQueue<>(100), new CustomizableThreadFactory(threadPoolPrefix + "-pool-"), new TimeoutBlockingPolicy(30)); } /** Use this method by default, it is asynchronous and handles threading for you. */ public void runAsync() { Thread mainLoopThread = new Thread(this); mainLoopThread.start(); } /** * Use this method only if you run it in your own thread pool, it runs synchronously in the * contextual thread. */ @Override public void run() { Thread mainLoopThread = Thread.currentThread(); // JVM awaits all shutdown hooks to complete // https://stackoverflow.com/questions/8663107/how-does-the-jvm-terminate-daemon-threads-or-how-to-write-daemon-threads-that-t Runtime.getRuntime() .addShutdownHook( new Thread( () -> { shouldShutdown = true; mainLoopThread.interrupt(); try { workerThreadPool.shutdown(); boolean terminated = workerThreadPool.awaitTermination(1, TimeUnit.MINUTES); if (!terminated) { List<Runnable> runnables = workerThreadPool.shutdownNow(); logger.info("shutdownNow with {} runnables undone", runnables.size()); } } catch (RuntimeException e) { logger.error("shutdown failed", e); throw e; } catch (InterruptedException e) { logger.error("shutdown interrupted", e); throw new IllegalStateException(e); } })); logger.info("polling loop started"); int receiveBackoffSeconds = 1; // "shouldShutdown" state is more reliable than Thread interrupted state while (!shouldShutdown) { List<Message> messages; try { messages = receiveMessages(); // after success, restore backoff to the initial value receiveBackoffSeconds = 1; } catch (Throwable e) { logger.error("failed to receive", e); logger.info("Gonna sleep {} seconds for backoff", receiveBackoffSeconds); try { //noinspection BusyWait Thread.sleep(receiveBackoffSeconds * 1000L); } catch (InterruptedException ex) { logger.error("backoff sleep interrupted", ex); } // after failure, increment next backoff (≤ limit) receiveBackoffSeconds = exponentialBackoff(receiveBackoffSeconds, 60); continue; } try { dispatchMessages(queueUrl, messages); } catch (Throwable e) { logger.error("failed to dispatch", e); } } } private int exponentialBackoff(int current, int limit) { int next = current * 2; return Math.min(next, limit); } private List<Message> receiveMessages() throws ExecutionException, InterruptedException { // visibilityTimeout = message handling timeout // It has usually been set at infrastructure level var receiveMessageFuture = client.receiveMessage( ReceiveMessageRequest.builder() .queueUrl(queueUrl) .waitTimeSeconds(10) .maxNumberOfMessages(maxParallelism) .build()); // Consumer can wait infinitely for the next message, rely on library default timeout. return receiveMessageFuture.get().messages(); } private void dispatchMessages(String queueUrl, List<Message> messages) { for (Message message : messages) { workerThreadPool.execute( () -> { String messageId = message.messageId(); try { logger.info("Started handling message with id={}", messageId); messageHandler.accept(message); logger.info("Completed handling message with id={}", messageId); // Should delete the succeeded message client.deleteMessage( DeleteMessageRequest.builder() .queueUrl(queueUrl) .receiptHandle(message.receiptHandle()) .build()); logger.info("Deleted handled message with id={}", messageId); } catch (Throwable e) { // Logging is enough. Failed message is not deleted, will be retried at next polling. logger.error("Failed to handle message with id=$messageId", e); } }); } } // Used by workerThreadPool private static class TimeoutBlockingPolicy implements RejectedExecutionHandler { private final long timeoutSeconds; public TimeoutBlockingPolicy(long timeoutSeconds) { this.timeoutSeconds = timeoutSeconds; } @Override public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) { try { BlockingQueue<Runnable> queue = executor.getQueue(); if (!queue.offer(r, this.timeoutSeconds, TimeUnit.SECONDS)) { throw new RejectedExecutionException("Timeout after " + timeoutSeconds + " seconds"); } } catch (InterruptedException e) { throw new IllegalStateException(e); } } }}