SKS scaling configuration


SKS scaling configuration

Jonathon Weiss

Hello all,

I seem to recall that several key server operators are running in a configuration with multiple SKS instances on a single machine, and others with multiple machines running SKS.

Would anyone doing either of these things be willing to share their configurations (especially sksconf and membership)?

        Thanks,
        Jonathon

        Jonathon Weiss <[hidden email]>
        MIT/IS&T/Cloud Platforms


Re: SKS scaling configuration

Todd Fleisher
Hi Jonathon,
I've previously spoken with Kristian about this off-list in an attempt to improve the performance & resilience of my own server pool(s), so let me share his recommendations, which I've been using with minimal issues.

The setup uses a caching NGINX server to reduce load on the backend nodes running SKS. His recommendation is to run at least 3 SKS instances in the backend (I’m running 4). Only one of the backend SKS nodes is configured to gossip with the outside world on the WAN, in addition to gossiping with the other backend SKS nodes on the LAN. The NGINX proxy is configured to prefer that node (the one gossiping with the outside world - let’s call it the "primary") for stats requests by giving it a much higher weight. As a quick aside, I’ve observed issues in my setup where stats requests are often directed to the other, internal SKS backend nodes - presumably because the primary node times out under the higher load of gossiping. That response then gets cached by the NGINX proxy and continues to get served, so my stats page reports only the internal gossip peer’s IP address instead of all of my external peers. If Kristian or anyone else has ideas on how to mitigate/minimize this, please do share. Whenever I check his SKS node @ http://keys2.kfwebs.net:11371/pks/lookup?op=stats I always find it reporting his primary node eta_sks1 with both external & internal peers listed.

Here are the relevant NGINX configuration options. Obviously you need to change the server IP addresses & the hostname returned in the headers:

upstream sks_servers
{
       server 192.168.0.55:11372 weight=5;
       server 192.168.0.61:11371 weight=10;
       server 192.168.0.36:11371 weight=10;
}

upstream sks_servers_primary
{
       server 192.168.0.55:11372 weight=9999;
       server 192.168.0.61:11371 weight=1;
       server 192.168.0.36:11371 weight=1;
}

map $arg_op $upstream_server {
       "stats" sks_servers_primary;
       default sks_servers;
}


And in server context:

proxy_buffering on;
proxy_buffer_size 1k;
proxy_buffers 24 4k;
proxy_busy_buffers_size 8k;
proxy_max_temp_file_size 2048m;
proxy_temp_file_write_size 32k;

        location /pks/hashquery {
                proxy_pass http://sks_servers/pks/hashquery;
                proxy_pass_header Server;
                add_header Via "1.1 keys2.kfwebs.net";
                proxy_ignore_client_abort on;
        }

        location /pks/add {
                client_max_body_size 5m;
                proxy_pass http://sks_servers/pks/add;
                proxy_pass_header Server;
                add_header Via "1.1 keys2.kfwebs.net";
                proxy_ignore_client_abort on;
        }

        location /pks {
                proxy_cache backcache;
                proxy_ignore_headers Cache-Control "Expires";
                proxy_cache_valid any 10m;
                # proxy_cache_bypass $http_cache_control;
                add_header X-Proxy-Cache $upstream_cache_status;
                proxy_pass http://sks_servers/pks;
                proxy_pass_header Server;
                add_header Via "1.1 keys2.kfwebs.net";
                proxy_ignore_client_abort on;
        }


Here is my sksconf file:

#  sksconf -- SKS main configuration
#
#basedir: /etc/sks

# debuglevel 3 is default (max. debuglevel is 10)
debuglevel: 3

nodename: sks03.pod01
hkp_address: 127.0.0.1
hkp_port: 11371
recon_port: 11370
#
server_contact: 0xD16C3A41949D203A
#from_addr: [hidden email]
#sendmail_cmd: /usr/sbin/sendmail -t -oi
#
initial_stat:
membership_reload_interval: 1
stat_hour: 17
#
# set DB file pagesize as recommended by db_tuner
# pagesize is (n * 512) bytes
# NOTE: These must be set _BEFORE_ [fast]build & pbuild and remain set
# for the life of the database files. To change a value requires recreating
# the database from a dump
#
# KDB/key 65536
pagesize: 128
#
# KDB/keyid 32768
keyid_pagesize: 64
#
# KDB/meta 512
meta_pagesize: 1
# KDB/subkeyid 65536
subkeyid_pagesize: 128
#
# KDB/time 65536
time_pagesize: 128
#
# KDB/tqueue 512
tqueue_pagesize: 1
#
# KDB/word - db_tuner suggests 512 bytes. This locked the build process
# Better to use a default of 8 (4096 bytes) for now
#word_pagesize: 8
#
# PTree/ptree 4096
ptree_pagesize: 8

disable_mailsync:

# Adding to try and address poison keys
command_timeout: 600
wserver_timeout: 30
max_recover: 150


As for my membership file(s), I try to keep all of them identical, just commenting out the peers that aren't needed on each node - only the primary node keeps the external peers uncommented. So my primary node (10.x.x.207) would look like this:

10.x.x.226 11370
10.x.x.217 11370
#10.x.x.207 11370
10.x.x.221 11370
#10.y.x.30 11370
#10.y.x.31 11370
#10.y.x.32 11370
#10.y.x.26 11370
pgp.librelabucm.org 11370 # LibreLabUCM [hidden email] 0x6FC10EAE0B5C3FC4
agora.cenditel.gob.ve 11370 # Lully Troconis <[hidden email]> 0x4758944f58aad9e1
pgpkeys.co.uk 11370 # Daniel Austin <[hidden email]> 0x34A3662F837F2C28
fks.pgpkeys.eu 11370 # Daniel Austin <[hidden email]> 0x34A3662F837F2C28
sks.infcs.de 11370 # Steffen Kaiser <[hidden email]> 5119CB3603B258AAC1EBA7A723A371DE9ABC764F
pgpkeys.urown.net 11370 # <[hidden email]> 0x27A69FC9A1744242


While one of my non-primary nodes (10.x.x.226) would look like this:

#10.101.7.226 11370
#10.101.7.217 11370
10.101.7.207 11370
#10.101.7.221 11370
#10.102.10.30 11370
#10.102.10.31 11370
#10.102.10.32 11370
#10.102.10.26 11370
#pgp.librelabucm.org 11370 # LibreLabUCM [hidden email] 0x6FC10EAE0B5C3FC4
#agora.cenditel.gob.ve 11370 # Lully Troconis <[hidden email]> 0x4758944f58aad9e1
#pgpkeys.co.uk 11370 # Daniel Austin <[hidden email]> 0x34A3662F837F2C28
#fks.pgpkeys.eu 11370 # Daniel Austin <[hidden email]> 0x34A3662F837F2C28
#sks.infcs.de 11370 # Steffen Kaiser <[hidden email]> 5119CB3603B258AAC1EBA7A723A371DE9ABC764F
#pgpkeys.urown.net 11370 # <[hidden email]> 0x27A69FC9A1744242


Another important item that was recently discussed on the list is to ensure you build your SKS DB with a DB_CONFIG file in your DB & PTree sub-directories to avoid issues with DB logfiles building up:

set_mp_mmapsize         268435456
set_cachesize 0 134217728 1
set_flags DB_LOG_AUTOREMOVE
set_lg_regionmax 1048576
set_lg_max 104857600
set_lg_bsize 2097152
set_lk_detect DB_LOCK_DEFAULT
set_tmp_dir /tmp
set_lock_timeout 1000
set_txn_timeout 1000
mutex_set_max 65536

Regarding the sizing of the virtual machines that host the SKS nodes: I originally provisioned mine with 4 VCPU, 4 GB of RAM, & 50 GB of disk storage (currently showing ~20GB used). However, with the recent additions to my sksconf file to try and address poison keys, I found it necessary to increase them to 8 GB of RAM to avoid swapping. I could probably downsize the VCPU count, as I rarely see it spike above 50% (out of 400%). I use Ubuntu Bionic (18.04 LTS); after installing the SKS package that ships with that release, I replace it with the version built to refuse at least one of the poison keys, as discussed @ https://lists.nongnu.org/archive/html/sks-devel/2018-07/msg00053.html. The actual package file can be downloaded @ https://launchpad.net/~canonical-sysadmins/+archive/ubuntu/sks-public/+packages. Hopefully one day it will be promoted to ship by default so it doesn't have to be downloaded and installed manually.

I think that about covers it, but if you have any questions or notice any mistakes please let me know. Hopefully you and others will find this to be a useful, one-stop-shop resource for setting up a more solid pool of SKS servers.

Cheers,
-T


Re: SKS scaling configuration

Michiel van Baak
On Sun, Feb 17, 2019 at 09:18:11AM -0800, Todd Fleisher wrote:

> The setup uses a caching NGINX server to reduce load on the backend nodes running SKS.
> His recommendation is to run at least 3 SKS instances in the backend (I’m running 4).
> Only one of the backend SKS nodes is configured to gossip with the outside world on the WAN, along with the other backend SKS nodes on the LAN.
> The NGINX proxy is configured to prefer that node (the one gossiping with the outside world - let’s call it the "primary") for stats requests with a much higher weight.
> As a quick aside, I’ve observed issues in my setup where the stats requests are often directed to the other, internal SKS backend nodes - presumably due to the primary node timing out due to higher load when gossiping.
> This then gets cached by the NGINX proxy and continues to get served so my stats page reports only the internal gossip peer’s IP address vs. all of my external peers.
> If Kristian or anyone else has ideas on how to mitigate/minimize this, please do share.
> Whenever I check his SKS node @ http://keys2.kfwebs.net:11371/pks/lookup?op=stats I always find it reporting his primary node eta_sks1 with external & internal peers listed.
>
> Here are the relevant NGINX configuration options. Obviously you need to change the server IP addresses & the hostname returned in the headers:
>
> upstream sks_servers
> {
>        server 192.168.0.55:11372 weight=5;
>        server 192.168.0.61:11371 weight=10;
>        server 192.168.0.36:11371 weight=10;
> }
>
> upstream sks_servers_primary
> {
>        server 192.168.0.55:11372 weight=9999;
>        server 192.168.0.61:11371 weight=1;
>        server 192.168.0.36:11371 weight=1;
> }

I would only put the 55 server in the 'upstream sks_servers_primary' so
it does not know about the others. That way the stats call will only go
to the primary. The downside is that it won't fail over when the primary
times out, but maybe that is exactly what you want for this specific
call.
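
Something like this, just to illustrate the idea (untested against your
setup, only the primary left in the stats upstream):

upstream sks_servers_primary
{
       server 192.168.0.55:11372;
}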

--
Michiel van Baak
[hidden email]
GPG key: http://pgp.mit.edu/pks/lookup?op=get&search=0x6FFC75A2679ED069

NB: I have a new GPG key. Old one revoked and revoked key updated on keyservers.


Re: SKS scaling configuration

Jeremy T. Bouse
In reply to this post by Todd Fleisher

Hi Todd,

    The timing of this thread and your reply is ideal, as I'm in the process of working to fix my cluster, which has been down for some time due to a system failure and a lack of available time on my part to repair it. Since I'm at it anyway, I've been revisiting the setup itself.

    I didn't have as many locations configured as you show in your example, but it looks like you define the map without ever using it in any of your location blocks, unless I'm missing something. Shouldn't you be using $upstream_server in your proxy_pass configuration?
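
    In other words, I would have expected the location blocks to reference the map variable rather than a fixed upstream name - just a sketch of what I mean, reusing your names:

        location /pks {
                proxy_pass http://$upstream_server;
        }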

    Having only one of the nodes gossiping outside the cluster definitely simplifies things compared to how I had things set up before, especially when moving the servers behind a firewall.


Re: SKS scaling configuration

Todd Fleisher
On Feb 23, 2019, at 8:35 PM, Jeremy T. Bouse <[hidden email]> wrote:

I didn't have as many locations configured as you show in your example but it looked like you were defining the map but I didn't see it being used in any of your location blocks unless I'm missing something. Shouldn't you be using $upstream_server in your proxy_pass configuration?

I think you’re on to something here. I just tried commenting out the other servers from the upstream sks_servers_primary block and am still seeing stats queries hitting the commented-out servers.

Kristian - could you please double check the configuration snippets you provided to me last year and see if something was missing related to this?

-T



Re: SKS scaling configuration

Jeremy T. Bouse

I ended up with the following NGINX configuration...

in /etc/nginx/conf.d/upstream.conf:

upstream sks_secondary {
    server 127.0.0.1:11371 weight=5;
    server 172.16.20.52:11371 weight=10;
    server 172.16.20.53:11371 weight=10;
    server 172.16.20.54:11371 weight=10;
}

upstream sks_primary {
    server 127.0.0.1:11371;
    server 172.16.20.52:11371 backup;
    server 172.16.20.53:11371 backup;
    server 172.16.20.54:11371 backup;
}

map $arg_op $sks_server {
    "stats" sks_primary;
    default sks_secondary;
}

in /etc/nginx/site-available/sks-default:

server {
    listen 172.16.20.51:80 default_server;
    listen 172.16.20.51:11371 default_server;
    listen [::]:80 ipv6only=on default_server;
    # listen [::]:11371 ipv6only=on default_server;
    access_log off;
    server_tokens off;
    root   /var/www/html;
    index  index.html index.htm;

    proxy_buffering on;
    proxy_buffer_size 1k;
    proxy_buffers 24 4k;
    proxy_busy_buffers_size 8k;
    proxy_max_temp_file_size 2048m;
    proxy_temp_file_write_size 32k;

    location /pks/hashquery {
        proxy_ignore_client_abort on;
        proxy_pass http://$sks_server;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_pass_header Server;
        add_header Via "1.1 sks.undergrid.net:$server_port (nginx)";
    }

    location /pks/add {
        proxy_ignore_client_abort on;
        proxy_pass http://$sks_server;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_pass_header Server;
        add_header Via "1.1 sks.undergrid.net:$server_port (nginx)";
        client_max_body_size 8m;
    }

    location /pks {
        proxy_cache ugns_sks_cache;
        # proxy_cache_background_update on;
        proxy_cache_lock on;
        proxy_cache_min_uses 3;
        proxy_cache_revalidate on;
        proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
        proxy_cache_valid any 10m;
        proxy_ignore_client_abort on;
        proxy_ignore_headers Cache-Control "Expires";
        proxy_pass http://$sks_server;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_pass_header Server;
        add_header Via "1.1 sks.undergrid.net:$server_port (nginx)";
        add_header X-Proxy-Cache $upstream_cache_status;
    }
}

The NGINX configuration appears to be working fine for me, and my 3 backend nodes are operating as I expect as well. The problem I'm currently seeing is that my primary node, which runs alongside NGINX, seems to be writing an awful lot of log files and not processing them, despite having the same DB_CONFIG file as the other 3 nodes. With each of the log files running 100MB, this quickly fills up the drive on that node, and then the sks and sks-recon processes simply fall over and crash; the other 3 nodes running behind it keep chugging along, just not getting any further gossip input.

I keep seeing the following log entry popping up only on my primary node:

    add_keys_merge failed: Eventloop.SigAlarm

On 2/25/2019 12:37 PM, Todd Fleisher wrote:
On Feb 23, 2019, at 8:35 PM, Jeremy T. Bouse <[hidden email]> wrote:

I didn't have as many locations configured as you show in your example but it looked like you were defining the map but I didn't see it being used in any of your location blocks unless I'm missing something. Shouldn't you be using $upstream_server in your proxy_pass configuration?

I think you’re on to something here. I just tried commenting out the other servers from the upstream sks_servers_primary block and am still seeing stats queries hitting the commented out servers.

Kristian - could you please double check the configuration snippets you provided to me last year and see if something was missing related to this?

-T



Re: SKS scaling configuration

Jeremy T. Bouse
In reply to this post by Michiel van Baak
So I'll preface this with a caveat that I know a couple of the recipient
mail servers are having some issues with my DMARC/DKIM/SPF settings so I
don't know if everyone is receiving my posts.

I've updated my configuration on sks.undergrid.net using NGINX and
load-balancing 4 SKS nodes... Here are my configs:

Under snippets/sks.conf:

    location /pks/hashquery {
        proxy_ignore_client_abort on;
        proxy_pass http://$sks_server;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_pass_header Server;
        add_header Via "1.1 $host:$server_port (nginx)";
        add_header X-Robots-Tag 'nofollow, notranslate' always;
    }

    location /pks/add {
        proxy_ignore_client_abort on;
        proxy_pass http://$sks_server;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_pass_header Server;
        add_header Via "1.1 $host:$server_port (nginx)";
        add_header X-Robots-Tag 'nofollow, notranslate' always;
        client_max_body_size 8m;
    }

    location /pks {
        proxy_ignore_client_abort on;
        proxy_pass http://$sks_server;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_pass_header Server;
        add_header Via "1.1 $host:$server_port (nginx)";
        add_header X-Robots-Tag 'nofollow, notranslate' always;
    }

Under conf.d/upstream.conf:

upstream sks_secondary {
    server 127.0.0.1:11371 weight=5;
    server 172.16.20.52:11371 weight=25;
    server 172.16.20.53:11371 weight=25;
    server 172.16.20.54:11371 weight=25;
}

upstream sks_primary {
    server 127.0.0.1:11371;
    server 172.16.20.52:11371 backup;
    server 172.16.20.53:11371 backup;
    server 172.16.20.54:11371 backup;
}

map $arg_op $sks_server {
    "stats" sks_primary;
    default sks_secondary;
}

    Of note is the use of "backup" rather than weights for the secondary
nodes under sks_primary. This allows a secondary node to respond to
stats queries when the primary is unable to respond, which has
typically been when IO Wait is over 30%.

Under sites-enabled/sks-default:

server {
    listen 172.16.20.51:80 default_server;
    listen [::]:80 ipv6only=on default_server;

    access_log off;
    server_tokens off;
    root   /var/www/html;
    index  index.html index.htm;

    location / {
        return 301 https://sks.undergrid.net$request_uri;
    }

    include snippets/sks.conf;
}

server {
    listen 172.16.20.51:11371 default_server;
    listen [::]:11371 ipv6only=on default_server;

    access_log off;
    server_tokens off;

    location / {
        proxy_ignore_client_abort on;
        proxy_pass http://$sks_server;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_pass_header Server;
        add_header Via "1.1 $host:$server_port (nginx)";
        add_header X-Robots-Tag 'nofollow, notranslate' always;
        client_max_body_size 8m;
    }
}

Under sites-enabled/sks-ugns-ssl:

server {
    listen 443 http2 ssl;
    listen [::]:443 http2 ipv6only=on ssl;
    server_name sks.undergrid.net;

    access_log off;
    server_tokens off;
    root   /var/www/html;
    index  index.html index.htm;

    ssl_certificate /etc/letsencrypt/live/sks.undergrid.net/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/sks.undergrid.net/privkey.pem;
    ssl_trusted_certificate
/etc/letsencrypt/live/sks.undergrid.net/chain.pem;
    ssl_stapling on;
    ssl_stapling_verify on;

    include snippets/sks.conf;
}

I also have a sites-available/sks-default-ssl ready to be enabled:

server {
    listen 443 http2 ssl default_server;
    listen [::]:443 http2 ipv6only=on ssl default_server;
    server_name pool.sks-keyservers.net *.pool.sks-keyservers.net;

    access_log off;
    server_tokens off;
    root   /var/www/html;
    index  index.html index.htm;

    ssl_certificate /path/to/sks-pool-certificate.pem;
    ssl_certificate_key /path/to/sks-pool-certificate-key.pem;
    ssl_trusted_certificate /usr/share/gnupg/sks-keyservers.netCA.pem;
    ssl_session_cache off;
    ssl_session_tickets off;

    location / {
        return 301 https://sks.undergrid.net$request_uri;
    }

    include snippets/sks.conf;
}

    Of note here is that SSL session cache and tickets are disabled to
prevent SSL session resumption on the pool hostnames.

Other than that, in my nginx.conf itself I have the following SSL
settings configured:

        ssl_protocols TLSv1.2;
        ssl_prefer_server_ciphers on;
        ssl_dhparam /etc/nginx/dhparam.pem;
        ssl_ciphers EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH;
        ssl_ecdh_curve secp384r1;
        ssl_session_timeout  10m;
        ssl_session_cache shared:SSL:10m;
        ssl_session_tickets on;
        resolver 172.16.20.2 192.168.1.2 valid=300s;
        resolver_timeout 5s;

    Testing on Qualys [1] gives my server an A grade supporting TLS 1.2
with Forward Secrecy. KF, I'll get a CSR sent to you later this week,
assuming the server would be accepted back into the HKPS pool.

    I'm still dealing with some high IO Wait times on my primary node,
which result in my server falling out of the pool quite often, although
I'm not sure why the backup stats from one of the secondary nodes
aren't being picked up; possibly still a timeout issue.
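
    If it does turn out to be a timeout problem, one thing I may try is
setting explicit proxy timeouts in the SKS locations so NGINX gives up
on a slow primary sooner and moves on to a backup. Roughly something
like this (untested, and the values are just guesses on my part):

        # give up on a slow upstream and try the next server in the group
        proxy_connect_timeout 5s;
        proxy_read_timeout 30s;
        proxy_next_upstream error timeout http_502 http_503 http_504;
        proxy_next_upstream_timeout 60s;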

    In trying to test and validate my configuration I did do several
queries against the pool hostnames and got an awful lot of 50[234]
errors, along with several HSTS headers.

    I still have a delta of about 300 keys or less between all my
secondary nodes and the primary, but everything seems to be stable aside
from the IO Wait issues I'm experiencing. I have reduced my membership
file down to my own nodes and the ones I could confirm I was still
cross-peered with (currently only 5 by my count), but many of them also
appear to be bouncing in and out of the pool quite frequently, so I'm
wondering if the IO Wait issue slowing down SKS's handling of stats
requests is not isolated to my setup.

    As soon as I get my new firewall in place I'll be working on adding
IPv6 support, as my current firewall device is not properly obtaining
the route from my provider. I'm also working on the Tor configuration;
if someone else has something they could share, that would be helpful.

1.
https://www.ssllabs.com/ssltest/analyze.html?d=sks.undergrid.net&hideResults=on


Re: SKS scaling configuration

Jonathon Weiss
In reply to this post by Jeremy T. Bouse
Thanks for the information everyone.  A further question:  I saw the advice of a minimum of three servers.  Anyone know how that was arrived at, or if there is a recommendation on how many queries an individual SKS back-end can handle?

  Jonathon

  Jonathon Weiss <[hidden email]>
  MIT/IS&T/Cloud Platforms





Re: SKS scaling configuration

Todd Fleisher
I don’t know if Kristian chose that number based on actual SKS load, since it can be hard to predict how much traffic the various servers in the pool may receive at any given time. That being said, the rule of 3 is pretty standard in operations: if one of your nodes is unavailable, you still have two left and haven't been reduced to a single point of failure.

-T


Re: SKS scaling configuration

Jeremy T. Bouse
In reply to this post by Jonathon Weiss
I'd previously had only 2 instances, and if they weren't peering outside
and one went down it seemed to cause problems. Since I was re-deploying
from scratch this time, I went with 3 backend secondary nodes, with the
primary node doing the peering outside my network. This way I can take 1
node out and the cluster still has 3 active nodes. I do have my primary
shut down to perform key dumps, and I also send more of the query
traffic to the secondary nodes, leaving the primary node to handle the
peering recon. In fact I just dumped the keys from my primary node and
rebuilt all 4 nodes from the dump last night.


Re: SKS scaling configuration

Kristian Fiskerstrand
In reply to this post by Todd Fleisher
On 2/25/19 6:37 PM, Todd Fleisher wrote:

>> On Feb 23, 2019, at 8:35 PM, Jeremy T. Bouse
>> <[hidden email] <mailto:[hidden email]>> wrote:
>>
>> I didn't have as many locations configured as you show in your example
>> but it looked like you were defining the map but I didn't see it being
>> used in any of your location blocks unless I'm missing something.
>> Shouldn't you be using $upstream_server in your proxy_pass configuration?
>>
> I think you’re on to something here. I just tried commenting out the
> other servers from the upstream sks_servers_primary block and am still
> seeing stats queries hitting the commented out servers.
>
> Kristian - could you please double check the configuration snippets you
> provided to me last year and see if something was missing related to this?
I don't recall the specifics of the snippets I sent over, but the above is
correct; you do the map definition with
map $arg_op $upstream_server {
        "stats" sks_servers_primary;
        default sks_servers;
}

which is used for

        location /pks/lookup {
                proxy_cache backcache;
                proxy_ignore_headers Cache-Control "Expires";
                #proxy_cache_bypass $http_cache_control;
                add_header X-Proxy-Cache $upstream_cache_status;
                proxy_cache_valid any 10m;
                proxy_pass http://$upstream_server;
                proxy_pass_header Server;
                add_header Via "1.1 keys2.kfwebs.net";
                proxy_ignore_client_abort on;
        }
}

--
----------------------------
Kristian Fiskerstrand
Blog: https://blog.sumptuouscapital.com
Twitter: @krifisk
----------------------------
Public OpenPGP keyblock at hkp://pool.sks-keyservers.net
fpr:94CB AFDD 3034 5109 5618 35AA 0B7F 8B60 E3ED FAE3
----------------------------
Corruptissima re publica plurimæ leges
The greater the degeneration of the republic, the more of its laws



Re: SKS scaling configuration

Todd Fleisher
Yeah, I thought it looked accurate. Attached is the full config for reference. I’m still seeing issues where NGINX frequently caches stats data from one of the non-primary nodes, even when I verify the primary node is responding when I query it directly on its internal 10-net IP address. It’s puzzling for sure, but it has taken a back seat to trying to keep my servers stable lately, as they’ve been experiencing sustained periods of very high IO usage that are causing instability. If anyone has nginx configurations to help combat the bad-key-induced instability, please do share.
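
One thing still on my to-do list for the stale stats problem is to tell NGINX not to cache op=stats responses at all, so a response that happens to come from a non-primary node can't linger in the cache. Something along these lines is what I have in mind (untested; $skip_stats_cache is just a name I made up):

map $arg_op $skip_stats_cache {
       "stats" 1;
       default 0;
}

And then in whichever location has proxy_cache enabled:

       proxy_no_cache $skip_stats_cache;
       proxy_cache_bypass $skip_stats_cache;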




-T

> On Mar 5, 2019, at 1:20 AM, Kristian Fiskerstrand <[hidden email]> wrote:
>
> don't recall the specifics of the snippets I sent over, but the above is
> correct, you do map definition at
> map $arg_op $upstream_server {
>        "stats" sks_servers_primary;
>        default sks_servers;
> }
>
> which is used for
>
>        location /pks/lookup {
>    proxy_cache backcache;
>        proxy_ignore_headers Cache-Control "Expires";
>
>    #proxy_cache_bypass $http_cache_control;
>    add_header X-Proxy-Cache $upstream_cache_status;
>    proxy_cache_valid any 10m;
>                proxy_pass http://$upstream_server;
>                proxy_pass_header  Server;
>                add_header Via "1.1 keys2.kfwebs.net";
>                proxy_ignore_client_abort on;
>        }
> }
>
> --
> ----------------------------
> Kristian Fiskerstrand
> Blog: https://blog.sumptuouscapital.com
> Twitter: @krifisk
> ----------------------------
> Public OpenPGP keyblock at hkp://pool.sks-keyservers.net
> fpr:94CB AFDD 3034 5109 5618 35AA 0B7F 8B60 E3ED FAE3
> ----------------------------
> Corruptissima re publica plurimæ leges
> The greater the degeneration of the republic, the more of its laws
>
