Sergii Kostenko's blog

Improve Oracle JDBC performance by fetch size tuning

28 December 2020

By default, when the Oracle JDBC driver executes a query, it retrieves a result set of 10 rows at a time from the database cursor. A low fetch size can cause more round trips to the database, which leads to a longer time to fetch query results. You can change the number of rows retrieved with each trip to the database cursor by changing the row fetch size.

Statement, PreparedStatement, CallableStatement, and ResultSet provide the following methods for dealing with fetch size:

void setFetchSize(int rows) throws SQLException

int getFetchSize() throws SQLException
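
For example, the fetch size can be raised per statement before iterating the result set. Below is a minimal sketch assuming a plain JDBC connection; the URL, credentials and table name are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class FetchSizeExample {

    public static void main(String[] args) throws Exception {
        // placeholder connection details - adjust to your environment
        try (Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@//localhost:1521/XEPDB1", "scott", "tiger");
             PreparedStatement ps = con.prepareStatement("SELECT * FROM MY_TABLE")) {

            ps.setFetchSize(1000); // ask the driver for 1000 rows per round trip instead of the default 10

            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // process the current row
                }
            }
        }
    }
}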

The default fetch size can be changed with the defaultRowPrefetch connection property:

On the Wildfly Application Server datasource level:

[standalone@localhost:9990 /] /subsystem=datasources/data-source=ExampleOraDS/connection-properties=defaultRowPrefetch:add(value=1000)

On the Hibernate level, via the hibernate.jdbc.fetch_size property:

<properties>
  ...
  <property name="hibernate.jdbc.fetch_size" value="1000" />
  ...
</properties>
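
And outside a container, the same defaultRowPrefetch property can be passed straight to the Oracle driver as a connection property. A minimal sketch, with placeholder URL and credentials:

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class PrefetchPropertyExample {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("user", "scott");              // placeholder credentials
        props.setProperty("password", "tiger");
        props.setProperty("defaultRowPrefetch", "1000"); // new default fetch size for all statements

        try (Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@//localhost:1521/XEPDB1", props)) {
            System.out.println("connected, defaultRowPrefetch=1000");
        }
    }
}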

I ran a simple test:

@Test
public void defaultRowPrefetchTest() throws Exception {
   EntityManager em = Persistence.createEntityManagerFactory("myDSTestOra").createEntityManager();

   // measure how long it takes to fetch the whole result set
   long time = System.currentTimeMillis();

   Query q = em.createNativeQuery("SELECT * FROM MY_TABLE", Tuple.class);
   List<Tuple> resultList = q.getResultList();

   System.out.println(System.currentTimeMillis() - time);
}

On my laptop, fetching 16K records takes ~185 ms with the default value and ~86 ms with defaultRowPrefetch = 20000. As you can see from the result, that is a more than 2x performance improvement.

Source code of the test case is on GitHub

Comments


Microprofile metrics with Wildfly Application Server

27 December 2020

No enterprise application can be completely successful in production without a good monitoring solution. For years, vendors and developers provided custom tooling for it. Since the Eclipse MicroProfile Metrics specification, we have a unified way to export monitoring data to management agents and a unified Java API that developers can use to expose their telemetry data.

Wildfly application server provides MicroProfile Metrics support, but unfortunately only for standalone configurations so far. In case of domain mode, you can add the necessary dependencies

dependencies {
    compile group: 'org.eclipse.microprofile.metrics', name: 'microprofile-metrics-api', version: '2.3'
    compile group: 'io.smallrye', name: 'smallrye-metrics', version: '2.4.0'
}

and then expose application-scope metrics through a custom endpoint like this:

import io.smallrye.metrics.exporters.JsonExporter;
import io.smallrye.metrics.exporters.OpenMetricsExporter;
...
@Singleton
@Path("/metrics")
public class MetricsTestResource {

  private OpenMetricsExporter openMetricsExporter = new OpenMetricsExporter();
  private JsonExporter jsonExporter = new JsonExporter();

  @GET
  @Path("/prmths")
  public String prometheus() {
    return openMetricsExporter.exportAllScopes().toString();
  }

  @GET
  @Path("/json")
  public String json() {
    return jsonExporter.exportAllScopes().toString();
  }
}

JVM and subsystem metrics will not be available through the endpoint above, but you can obtain them via good old JMX.
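
For illustration, a hedged sketch of reading a couple of JVM values through the standard platform MXBeans, roughly the same data the base scope exposes (run it inside the application, or point the same calls at a remote JMX connector):

import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class JvmMetricsViaJmx {

    public static void main(String[] args) {
        ClassLoadingMXBean classLoading = ManagementFactory.getClassLoadingMXBean();
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();

        // comparable to base_classloader_loadedClasses_count and the heap metrics of the base scope
        System.out.println("loaded classes: " + classLoading.getLoadedClassCount());
        System.out.println("used heap bytes: " + memory.getHeapMemoryUsage().getUsed());
    }
}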

Out of the box, the standalone server provides metrics in Prometheus format for all scopes over the management interface (port 9990), using the org.wildfly.extension.microprofile.metrics-smallrye extension and the microprofile-metrics-smallrye subsystem.

kostenko@kostenko:$ curl http://127.0.0.1:9990/metrics/
# HELP base_classloader_loadedClasses_count Displays the number of classes that are currently loaded in the Java virtual machine.
# TYPE base_classloader_loadedClasses_count gauge
base_classloader_loadedClasses_count 11826.0
# HELP base_cpu_availableProcessors Displays the number of processors available to the Java virtual machine. This value may change during a particular invocation of the virtual machine.
# TYPE base_cpu_availableProcessors gauge
base_cpu_availableProcessors 8.0
...

For developers, MicroProfile Metrics also provides annotations such as @Counted, @Timed, @Metered, @ConcurrentGauge and @Gauge for exposing application-level metrics.
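
A minimal sketch of how these annotations might look on a JAX-RS resource (the resource, metric names and descriptions are illustrative, not from the original example):

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import org.eclipse.microprofile.metrics.annotation.Counted;
import org.eclipse.microprofile.metrics.annotation.Timed;

@Path("/orders")
public class OrderResource {

    @GET
    @Counted(name = "orders_requests_total", description = "How many times the order list was requested")
    @Timed(name = "orders_requests_timer", description = "How long listing orders takes")
    public String list() {
        return "[]"; // placeholder payload
    }
}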

Prometheus is a free software application used for event monitoring and alerting. It records real-time metrics in a time series database (allowing for high dimensionality) built using an HTTP pull model, with flexible queries and real-time alerting.

Let's set up the above and check how metrics monitoring with Prometheus looks in practice:

wget https://github.com/prometheus/prometheus/releases/download/v2.23.0/prometheus-2.23.0.linux-amd64.tar.gz
tar xvfz prometheus-*.tar.gz
cd prometheus-*

To provide the path to the metrics endpoint, edit prometheus.yml and set the correct metrics_path and targets:

# Scrape the Wildfly management interface instead of Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['127.0.0.1:9990']

This is it! http://localhost:9090/graph :
[screenshot: wildfly-microprofile-metrics in the Prometheus graph view]

Now our metrics are being collected and can be visualized in the standard Prometheus UI (shown above) or easily integrated with Grafana.

Source code of the custom metrics endpoint example is available on GitHub

Comments


Infinispan Server as Wildfly remote cache container for your Jakarta EE application

29 November 2020


Recently I wrote a few articles about using the Infinispan cache based on the Wildfly infinispan subsystem. But even though Wildfly provides good cache container management support, from the high-load and high-availability points of view it makes sense to take a look at separate clustered cache instances.

PROS:
- cache lifecycle and scaling are independent from the application cluster
- cached data survives application redeploys and restarts
- the same cache cluster can be shared by several applications

CONS:
- every cache access becomes a network round trip with serialization overhead
- one more piece of infrastructure to install, secure and monitor

Fortunately, with Wildfly Application Server it is easy enough to switch between embedded and remote cache containers, even at runtime (it is just another JNDI lookup). So, let's try it out! First, we need to download a stable Infinispan Server release. I have chosen 10.1.8, as my Wildfly 20 uses this version, so potential compatibility issues should be excluded.

After downloading, extract the distribution archive and run the Infinispan server:

kostenko@kostenko:/opt/infinispan-server-10.1.8.Final/bin$ ./server.sh

By default, the Infinispan server uses port 11222 on 127.0.0.1. To bind to another IP, just use the -b binding parameter on startup, e.g. -b 0.0.0.0.

To create a named cache you can use the provided UI (http://127.0.0.1:11222/) or the CLI console:

/opt/iplatform/infinispan/bin/cli.sh
[disconnected]> connect
create cache --template=org.infinispan.REPL_ASYNC myremotecache
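
Before touching Wildfly at all, you can smoke-test the new cache from a standalone HotRod client. A minimal sketch, assuming the host, port and cache name from the setup above (depending on the server security settings you may also have to configure credentials):

import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;

public class HotRodSmokeTest {

    public static void main(String[] args) {
        ConfigurationBuilder builder = new ConfigurationBuilder();
        builder.addServer().host("127.0.0.1").port(11222);

        RemoteCacheManager cacheManager = new RemoteCacheManager(builder.build());
        try {
            RemoteCache<String, String> cache = cacheManager.getCache("myremotecache");
            cache.put("KEY1", "VALUE1");
            System.out.println(cache.get("KEY1")); // VALUE1
        } finally {
            cacheManager.stop();
        }
    }
}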

Now let's configure Wildfly to use the remote cache container:

/socket-binding-group=standard-sockets/remote-destination-outbound-socket-binding=ispn1:add(host=127.0.0.1, port=11222)
batch
/subsystem=infinispan/remote-cache-container=myRemoteContainer:add(default-remote-cluster=data-grid-cluster)
/subsystem=infinispan/remote-cache-container=myRemoteContainer/remote-cluster=data-grid-cluster:add(socket-bindings=[ispn1])
run-batch

Actually, we have just finished the environment configuration and are now ready for application development. As usual, the Jakarta EE build.gradle looks pretty laconic:

apply plugin: 'war'
dependencies {
    providedCompile "jakarta.platform:jakarta.jakartaee-api:8.0.0"
    providedCompile "org.infinispan:infinispan-core:10.1.8.Final"
    providedCompile "org.infinispan:infinispan-client-hotrod:10.1.8.Final"
}

To use the configured cache container, just inject the registered @Resource:

import java.util.Date;
import javax.annotation.Resource;
import javax.inject.Named;

@Named
public class TestCacheService {

    public static final String REMOTE_CACHE_NAME = "myremotecache";

    @Resource(lookup = "java:jboss/infinispan/remote-container/myRemoteContainer")
    org.infinispan.client.hotrod.RemoteCacheContainer remoteCacheContainer;

    public void putRemoteCache(String key, String value) {
        remoteCacheContainer.getCache(REMOTE_CACHE_NAME).put(key, String.format("%s (%s)", value, new Date()));
    }

    public Object getRemoteCache(String key) {
        return remoteCacheContainer.getCache(REMOTE_CACHE_NAME).get(key);
    }
}
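
The curl calls used for testing below go through a thin JAX-RS layer on top of this service. A minimal sketch of what such endpoints could look like; the paths and parameter names are assumptions derived from the test URLs, not part of the original code:

import javax.inject.Inject;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.QueryParam;

@Path("/jcache")
public class CacheTestResource {

    @Inject
    TestCacheService cacheService;

    @GET
    @Path("/ispn-remote-put")
    public String put(@QueryParam("key") String key, @QueryParam("value") String value) {
        cacheService.putRemoteCache(key, value);
        return "ok";
    }

    @GET
    @Path("/ispn-remote-get")
    public String get(@QueryParam("key") String key) {
        return String.valueOf(cacheService.getRemoteCache(key));
    }
}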

You can also declare a resource reference in the WEB-INF/web.xml descriptor and use a shorter resource lookup by name, like @Resource(name = "myremotecontainer"):

<resource-env-ref>
    <resource-env-ref-name>myremotecontainer</resource-env-ref-name>
    <lookup-name>java:jboss/infinispan/remote-container/myRemoteContainer</lookup-name>
</resource-env-ref>

The last thing we need is to provide module dependencies via MANIFEST.MF:

Manifest-Version: 1.0
Dependencies: org.infinispan, org.infinispan.commons, org.infinispan.client.hotrod export

or through jboss-deployment-structure.xml:

<jboss-deployment-structure>
   <deployment>
      <dependencies>
         <module name="org.infinispan" export="true" />
         <module name="org.infinispan.commons" export="true" />
         <module name="org.infinispan.client.hotrod" export="true" />
      </dependencies>
   </deployment>
</jboss-deployment-structure>

This is it! Build, deploy, and test it out.

curl -o - "http://localhost:8080/jcache-examples/jcache/ispn-remote-put?key=KEY1&value=VALUE1"
ok
curl -o - "http://localhost:8080/jcache-examples/jcache/ispn-remote-get?key=KEY1"
VALUE1 (Sat Nov 28 20:48:51 EET 2020)

To check remote cache container statistics, you can use the UI or the Infinispan CLI console:

[disconnected]> connect
cd caches
stats myremotecache
{
  "time_since_start" : 23866,
  "time_since_reset" : 23866,
  "current_number_of_entries" : 1,
  "current_number_of_entries_in_memory" : 1,
  "total_number_of_entries" : 1,
  "off_heap_memory_used" : 0,
  ...

The last point I would like to pay attention to is cache container high availability with Infinispan clustering. By default, Infinispan uses the MPING (multicast) protocol for cluster auto-discovery. You can easily check it just by running another ISPN instance on the same network. For example:

$ cd <ISPN_HOME>
$ cp -r server server2
$ bin/server.sh -o 100 -s server2

$ bin/cli.sh
connect
describe
{
  "version" : "10.1.8.Final",
  ...
  "cluster_members_physical_addresses" : [ "127.0.0.1:7800", "127.0.0.1:7801" ],
  "cluster_size" : 2,
  ...
}

Do not forget to add the new ISPN node to your Wildfly configuration:

/socket-binding-group=standard-sockets/remote-destination-outbound-socket-binding=ispn2:add(host=127.0.0.1, port=11322)
/subsystem=infinispan/remote-cache-container=myRemoteContainer/remote-cluster=data-grid-cluster:write-attribute(name=socket-bindings, value=[ispn1,ispn2])

Please note that if you perform a cloud deployment or have network restrictions, auto-discovery with MPING may not be available. In this case you can use a static list of IP addresses by providing a TCPPING configuration via server/conf/infinispan.xml. Just add a jgroups section and edit the transport stack for the default cache-container:

<infinispan>

 <jgroups>
    <stack name="mytcpping">
      <TCP bind_port="7800" port_range="30" recv_buf_size="20000000" send_buf_size="640000"/>
      <TCPPING   initial_hosts="${jgroups.tcpping.initial_hosts:127.0.0.1[7800],127.0.0.1[7801]}"/>
      <MERGE3 />
      <FD_SOCK />
      <FD_ALL timeout="3000" interval="1000" timeout_check_interval="1000" />
      <VERIFY_SUSPECT timeout="1000" />
      <pbcast.NAKACK2 use_mcast_xmit="false" xmit_interval="100" xmit_table_num_rows="50" xmit_table_msgs_per_row="1024" xmit_table_max_compaction_time="30000" />
      <UNICAST3 xmit_interval="100" xmit_table_num_rows="50" xmit_table_msgs_per_row="1024" xmit_table_max_compaction_time="30000" />
      <pbcast.STABLE stability_delay="200" desired_avg_gossip="2000" max_bytes="1M" />
      <pbcast.GMS print_local_addr="false" join_timeout="${jgroups.join_timeout:2000}" />
      <UFC max_credits="4m" min_threshold="0.40" />
      <MFC max_credits="4m" min_threshold="0.40" />
      <FRAG3 />
    </stack>
  </jgroups>

   <cache-container name="default" statistics="true">
     <transport stack="mytcpping" node-name="${infinispan.node.name:}"/>
   </cache-container>
...

For more details about configuration, please refer to the WildFly 20 Infinispan Model Reference and the Infinispan community documentation.

Source code of the described example is available on GitHub

Comments


ORA-01795 and JPA function workaround

27 November 2020

A few posts ago I wrote about a Hibernate Interceptor to solve the ORA-01795: maximum number of expressions in a list is 1000 error. That approach can be very helpful in case you hit this limitation but, for some reason, are not able to perform refactoring.

Another way to get it done with JPQL is the JPA function(). We used a similar approach to implement JPA paging with COUNT(*) OVER().

So, let's see how little code we need to get it working with a custom Dialect workaround.

Custom dialect:

public class MyOraDialect extends Oracle10gDialect {
    public MyOraDialect () {
        super();
        // sql tuples workaround
        registerFunction( "safeTupleIn", new VarArgsSQLFunction( StandardBasicTypes.INTEGER, "(", ",0),(", ",0)"));
        // custom SQLFunction workaround
        registerFunction( "safeIn", new SafeInFunction());
    }
}
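
For the dialect to be picked up, it has to be registered as the Hibernate dialect of the persistence unit. One option (a sketch; the dialect class package is illustrative, the persistence unit name is taken from the test below) is to pass it programmatically when creating the EntityManagerFactory:

import java.util.HashMap;
import java.util.Map;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class DialectBootstrap {

    public static EntityManagerFactory create() {
        Map<String, Object> props = new HashMap<>();
        // overrides the hibernate.dialect setting from persistence.xml
        props.put("hibernate.dialect", "com.example.MyOraDialect");
        return Persistence.createEntityManagerFactory("myDSTestOra", props);
    }
}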

Usage example:

@Test
public void safeTupleIn1000Test() throws Exception {
     EntityManager em = Persistence.createEntityManagerFactory("myDSTestOra").createEntityManager();
     // this.generateTestData(em);
     ...
     Query query =  em.createQuery("SELECT b as post FROM OraBlogEntity b where (id, 0) in (function('safeTupleIn',:ids))", Tuple.class);
     query.setParameter("ids", idsList);
     List<Tuple> tpList = query.getResultList();
     System.out.println(tpList.size());
}

or a bit cleaner:

...
Query query =  em.createQuery("SELECT b as post FROM OraBlogEntity b where id in (function('safeIn', id, :ids))", Tuple.class);
query.setParameter("ids", idsList);
...

The resulting SQL in the SQL tuples case will look like:

Hibernate: select orablogent0_.id as id1_0_, orablogent0_.body as body2_0_, orablogent0_.title as title3_0_ from orablogentity orablogent0_ where (orablogent0_.id , 0) in ((?,0),(?,0),(?....)

In the case of the custom SQLFunction implementation:

Hibernate: select orablogent0_.id as id1_0_, orablogent0_.body as body2_0_, orablogent0_.title as title3_0_ from orablogentity orablogent0_ where orablogent0_.id in (?,...,?) or orablogent0_.id in (?,...,?)

Below is a simple example of how a custom org.hibernate.dialect.function.SQLFunction can be implemented.

public class SafeInFunction implements SQLFunction {
    private final static int IN_CLAUSE_LIMIT = 1000;
    ...
    @Override
    public String render(Type firstArgumentType, List arguments, SessionFactoryImplementor factory) throws QueryException {
        final StringBuilder buf = new StringBuilder();
        String fieldName = (String) arguments.get(0);
        for (int i = 1; i < arguments.size(); i++) {
            if (i % IN_CLAUSE_LIMIT == 0) {
                buf.deleteCharAt(buf.length() - 1).append(") or ").append(fieldName).append(" in (");
            }
            buf.append("?,");
        }
        return buf.deleteCharAt(buf.length() - 1).toString();
    }
}

P.S. Hibernate provides an org.hibernate.dialect.Dialect method to override:

/**
 * Return the limit that the underlying database places on the number of elements in an {@code IN} predicate.
 * If the database defines no such limits, simply return zero or less-than-zero.
 *
 * @return int The limit, or zero-or-less to indicate no limit.
 */
public int getInExpressionCountLimit() {
  return 0;
}

But unfortunately it does not provide a proper implementation yet and just throws a warning:

WARN: HHH000443: Dialect limits the number of elements in an IN predicate to 1000 entries.  However, the given parameter list [ids] contained 1111 entries, which will likely cause failures to execute the query in the database

Source code of the described example is available on GitHub

Comments


Wildfly/Infinispan domain. Keep distributed cache on separate nodes

03 November 2020

Previously I wrote about developing a Jakarta EE application using a distributed cache with Wildfly and Infinispan. That solution is a good fit for small clustered environments where data distribution between nodes does not cost too much. If you are looking for a clustered environment where application scaling has minimal impact on the cache and vice versa, but for some reason you would not like to use a separate infinispan-server cluster as a remote-cache-container, then the following topology can be a solution:
[diagram: cache-server-group topology]

To make it work, we need to configure a distributed-cache for two server groups and set a zero capacity-factor for one of them. Below is a simple configuration example:

# clone current profile
/profile=full-ha:clone(to-profile=full-ha-cache)
# Create cache server group based on new profile
/server-group=cache-servers:add(profile=full-ha-cache, socket-binding-group=full-ha-sockets)
# Add cache container and distributed cache for both profiles
/profile=full-ha/subsystem=infinispan/cache-container=mycachecontainer:add(statistics-enabled=true)
/profile=full-ha-cache/subsystem=infinispan/cache-container=mycachecontainer:add(statistics-enabled=true)
/profile=full-ha-cache/subsystem=infinispan/cache-container=mycachecontainer/distributed-cache=mycache:add()
/profile=full-ha/subsystem=infinispan/cache-container=mycachecontainer/distributed-cache=mycache:add()
# Create cache servers
/host=master/server-config=cache1:add(group=cache-servers,socket-binding-port-offset=500)
/host=master/server-config=cache1:start(blocking=true)
/host=master/server-config=cache2:add(group=cache-servers,socket-binding-port-offset=600)
/host=master/server-config=cache2:start(blocking=true)
# Configure ZERO capacity for profile which we will use for application
/profile=full-ha/subsystem=infinispan/cache-container=mycachecontainer/distributed-cache=mycache:write-attribute(name=capacity-factor, value=0)
# Provide transport
/profile=full-ha/subsystem=infinispan/cache-container=mycachecontainer/transport=jgroups:add()
/profile=full-ha-cache/subsystem=infinispan/cache-container=mycachecontainer/transport=jgroups:add()

Now let's deploy our application on both server groups:

deploy jcache-examples.war --server-groups=backend-servers,cache-servers

You can check the cached number-of-entries on each server with:

/host=master/server=backend1/subsystem=infinispan/cache-container=mycachecontainer/distributed-cache=mycache:read-resource(include-runtime=true)
/host=master/server=cache1/subsystem=infinispan/cache-container=mycachecontainer/distributed-cache=mycache:read-resource(include-runtime=true)

And make sure that the application backend-servers always show "number-of-entries" => 0.
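
On the application side nothing changes compared to the embedded-cache setup from the earlier articles. A minimal sketch of injecting the distributed cache, assuming the usual Wildfly JNDI binding pattern java:jboss/infinispan/cache/<container>/<cache> (adjust the lookup to your configuration):

import javax.annotation.Resource;
import javax.inject.Named;
import org.infinispan.Cache;

@Named
public class DomainCacheService {

    // assumed default JNDI binding for the cache configured above
    @Resource(lookup = "java:jboss/infinispan/cache/mycachecontainer/mycache")
    Cache<String, String> cache;

    public void put(String key, String value) {
        cache.put(key, value);
    }

    public String get(String key) {
        return cache.get(key);
    }
}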

Comments